Abstract

We consider transformations between attractor basins of binary cylindrical cellular
automata resulting from mutations. An r-point mutation of a state consists of
toggling r sites in that state. Results of such mutations are described by a
rule-dependent probability matrix. The structure of this matrix is studied in relation to
the structure of the state transition diagram and several theorems relating these are 
proved for the case of additive rules. It is shown that the steady state of the Markov 
process defined by the probability matrix is always the uniform distribution over 
the state transition diagram. Some results on eigenvalues are also obtained.

1 Introduction



In one form or another, the dialectical view of change formalized by Hegel in 
the early nineteenth century has dominated most of nineteenth and twentieth 
century thought. In line with this view, evolutionary processes were initially 
conceptualized in terms of gradual optimization in which small variations,
or mutational changes, struggled for survival in environments with limited
resources. In recent theorizing about complex adaptive systems, however, the 
idea of sudden mutational change has come to play a significant role. This 
is change that can occur suddenly, in apparently unpredictable jumps, rather 
than as a gradual transformation of quantity into quality.

Paradigmatic examples of such mutational change are state transitions in
quantum systems and genetic mutations, and mathematical models exhibiting 
the potential for such mutational jumps have proliferated. In continuous dy- 
namical systems, research has demonstrated there may be many metastable at- 
tractors with random jumps between attractors induced by noise [11,19,21,24] 
and, under certain circumstances, these jumps may be controllable [7]. Simi- 
lar results on noise induced transitions between metastable states have been 
obtained by Antonelli and Zastawniak [2]. In a model of the evolution of a
dimorphic clone in the presence of both internal developmental noise and en- 
vironmental fluctuations they show that stationary solutions "are segregated 
into disjoint invariant sets, providing clonal type stability, growth canaliza- 
tion, and variability within each clonal type... [while] the interaction between 
the environmental and developmental noise can trigger transitions... from one 
clonal type to another." 

Genetic mutation and evolution have been taken as a generic model in many 
studies of change in complex systems. Work at the forefront of complex systems 
research has focused on such mutational jumps [4,14,13,20,25,28], emphasiz- 
ing a new approach to evolutionary change that Crutchfield [4] has called 
epochal evolution. An important aspect in theorizing about epochal evolution 
processes is categorization of abstract "genotypes" into fitness classes, with 
all genotypes in a given class having more or less equal fitness. Populations 
are described by a probability distribution over these classes, with selection 
acting between, but not within, each class. 

Since the space of possible genotypes is extremely large, what Scott [27] terms 
"immense", at any given historical period only a small number of genotypes 
will actually be manifest in a population. Innovations arise when random drift 
within an existing fitness class "discovers" a portal to a previously unoccupied 
class with higher fitness. Exploitation of the advantage of this new class of 
genotypes leads to rapid change in the population distribution with the highest 
fitness class dominating — a new evolutionary epoch has arisen. 

An explicit feature of the theory of epochal evolution is the idea of modu- 
larity in the space of genotypes (or, more generally, in an appropriate state 
space). The idea of modularity has been present in ecology at least since May's 
seminal work on stability and complexity of ecosystems [23]. It was May's sug- 
gestion that in the case of complex adaptive systems there will be only very 
small regions in the system parameter space where the system has long term 
stability. In genetic terms, it might be posited that only certain prototypic 
genotypes are compatible with an organism's survival and that fitness classes 
can be defined as the classes of genotypes that are related to these prototypes 
by neutral mutations. (It must be emphasized, of course, that the idea of a 
prototype for a fitness class is an idealization. There may well be no genotype 
in a class that actually matches the prototype, which can be taken as a
fictional genotype optimized to an idealized environment in which there are no
fluctuations of environmental parameters.) 

Portals between fitness classes appear at points where a jump between one 
fitness class and another is possible via only a single or a small number of 
mutations. While it is usually assumed that each genotype within a given 
fitness class has exactly the same fitness as every other genotype in the class, 
this is not a necessary assumption. All that is required is that the time average 
fitness of each genotype in any given class, when weighted by the spectrum of 
environmental fluctuations, be equal. 

For example, if the environment can be modelled as fluctuating between n
different states labelled $e_j$, $1 \le j \le n$, and a given fitness class $F(r)$ contains
m genotypes $g_i$, $1 \le i \le m$, then the $m \times n$ matrix $W(r)$ with i,j entry $W_{ij}$ equal
to the fitness of genotype $g_i$ in environment $e_j$ describes the overall fitness of
the class $F(r)$. The condition that $F(r)$ be a fitness class is that

$$\sum_{j=1}^{n} W_{ij}\, p_j = \sum_{j=1}^{n} W_{kj}\, p_j \tag{1}$$

for all i and k, where $p_j$ is the probability that environmental conditions $e_j$ will
occur. Then consideration can focus on the dynamics introduced by mutations
that result in transitions between fitness classes.
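
The fitness-class condition can be illustrated with a small numerical check. The following is only a sketch: the function name `is_fitness_class` and the fitness table are hypothetical and are not taken from the paper. It tests whether every genotype (row) has the same mean fitness when the environments (columns) are weighted by their probabilities.

```python
def is_fitness_class(W, p, tol=1e-12):
    """Condition (1): all genotypes share the same p-weighted mean fitness."""
    means = [sum(w * pj for w, pj in zip(row, p)) for row in W]
    return max(means) - min(means) <= tol

# Hypothetical fitness table: 3 genotypes (rows) in 2 environments (columns).
W = [[1.0, 3.0],
     [2.0, 2.0],
     [3.0, 1.0]]

print(is_fitness_class(W, [0.5, 0.5]))  # equal weights: the rows average alike
print(is_fitness_class(W, [0.9, 0.1]))  # skewed weights: the averages differ
```

The same set of genotypes can thus form a fitness class under one spectrum of environmental fluctuations and fail to do so under another.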

One means of representing a system with modular classes undergoing epochal 
evolution is graphically, as a set of vertices labelled either by genotypes or 
by fitness classes, with an edge connecting two vertices if there is a mutation 
that relates the corresponding genotypes or fitness classes. This representa- 
tion allows application of the tools of graph theory and network dynamics 
[1,6,15,26,31]. Network dynamics, in both discrete and continuous forms, is 
emerging as a major mathematical tool in many areas of biology, social sci- 
ence, and economics. In real cases, however, the networks encountered tend 
to be highly complex and analytically intractable. Thus, the study of simple 
"toy" models has become important as a means of gaining insight into real 
world cases where general underlying principles and laws might be masked by 
the high level of complexity. Two cases of simple model systems are found in 
cellular automata (CA), and in random Boolean networks [12,20,30]. This pa- 
per considers cellular automata as potential models of complex systems with 
modular network structures that are subject to mutational transitions. 

As a basis for modelling mutational jumps between fitness classes (or more 
generally, between modular elements arising in network dynamics), however, 
cellular automata are insufficient. While the attractor basins of a cellular au- 
tomaton can be used as models of modular units such as fitness classes, there 
is no mechanism providing for transitions between basins. Cellular automata 
dynamics are completely deterministic: every state lies in a basin of attraction,
and iteration of an update rule takes each state to a specific attractor.
Thus, interest in cellular automata has usually focused on elucidation of at- 
tractor basins [33,36], exploration of space-time patterns generated [16,32], 
and determination of the mathematical properties of various types of rules 
[10,18,22] (see [12] for extensive review). 

Recently, however, Wuensche [35] has pointed out the possibility of using 
cellular automata attractor basins as models of modular sub-networks and 
studying either changes in the network topology of state transition diagrams 
introduced by perturbations of the generating rules, or transitions between 
attractor basins induced by mutations in the states themselves. The first ap- 
proach has been explored using probabilistic cellular automata [8], but little 
attention has focused on the second. 

In the present paper this second approach is taken up with a study of mu- 
tationally induced transitions between basins of attraction of simple cellular 
automata. There are two ways to view such transitions. The first, which is the 
focus of this paper, is to introduce point mutations by toggling one or more
sites in a given state and study the nature of the transitions this introduces 
between attractor basins of a given CA rule. The second method, to be treated 
in a subsequent paper, is based on the specification of a probability matrix 
that arbitrarily fixes the probability of a transition between distinct basins 
of attraction. In this approach, one studies the effect of such transitions and 
their relation to a defined cost, or fitness function. 

Consideration is limited to binary valued "cylindrical" cellular automata [17], 
that is, cellular automata rules defined on binary strings of fixed length with 
periodic boundary conditions. In contrast to the usual convention, however, 
in which neighbourhoods are taken as symmetric about a central mapping 
site, left-justified neighbourhoods are used in this paper. That is, if
$\{i, i+1, \ldots, i+k-1\}$ denotes a k-site neighbourhood, then the value assigned by
this neighbourhood at the next iteration of the CA rule appears at site i.
This has certain advantages when considering mappings of half-infinite binary 
strings as maps of the unit interval. The main effect of this different neigh- 
bourhood choice is that some cycle periods are changed from the symmetric 
neighbourhood case. 

It is also assumed that mutations occur on a much faster time scale than CA 
rule iterations. This means that the entire attractor basin is important rather 
than just the attractor itself. Physically this corresponds to systems whose 
natural dynamics operate on a time scale that is orders of magnitude slower 
than environmental fluctuations that might induce mutations. The opposite 
case, in which the cellular automata dynamics operates on a much faster time 
scale than that of mutation leads to a situation in which only the attractors 
are relevant since any mutation to a state not on an attractor will iterate
quickly to the attractor.



2 Point Mutations on $E_n$



Given an n-digit binary string $\mu$, a mutation $\mu \to \mu'$ is produced by randomly
toggling one or more of the digits of $\mu$. An r-point mutation corresponds to
toggling r digits. The effect of such a mutation on elements of $E_n$, the state
space of n-digit binary sequences, is described by a $2^n \times 2^n$ (0,1) matrix $T_n(I,r)$
defined by

$$[T_n(I,r)]_{ij} = \begin{cases} 1 & \exists \text{ an } r\text{-digit toggle } j \to i \\ 0 & \text{otherwise} \end{cases} \tag{2}$$

where the indices i and j are expressed as n-digit binary strings. If $j_0 \cdots j_{n-1}$
is the binary form of j then toggling r digits is equivalent to the site-wise
addition mod(2) of an n-digit string containing r ones. It is easy to see that
$T_n(I,0)$ is the $2^n \times 2^n$ identity matrix $I$ while $T_n(I,n)$ is the $2^n \times 2^n$
anti-identity $I^*$. $T_n(I,r)$ is symmetric and each row and each column of $T_n(I,r)$
contains

$$\binom{n}{r} = \frac{n!}{r!\,(n-r)!}$$

ones.

Graph theoretically, $T_n(I,r)$ is the adjacency matrix of the n-hypercube $H_n(r)$
with edges connecting vertices that are separated by Hamming distance r.
Note also that since $T_n^2(I,1)$ allows for the case in which the same site is
toggled twice, $T_n(I,2) \neq T_n^2(I,1)$. Instead, $2T_n(I,2) = T_n^2(I,1) - nI$. Since
each row and column of $T_n(I,r)$ contains the same number of ones, the matrix

$$\mathbf{T}_n(I,r) = \frac{1}{\binom{n}{r}}\, T_n(I,r) \tag{3}$$

is a probability matrix with i,j element equal to the probability that an r-point
mutation of the string $j_0 \ldots j_{n-1}$ will yield the string $i_0 \ldots i_{n-1}$. Because all
non-zero entries in this matrix are equal, it defines a Markov process with
steady state probability vector $2^{-n}\mathbf{1}$ where $\mathbf{1}$ is the vector consisting of all
ones.
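
The properties just listed are easy to confirm numerically for small n. The sketch below is an illustration rather than the authors' construction: it assumes states are stored as integers whose binary digits are the sites, and the helper names `toggle_matrix` and `matmul` are hypothetical.

```python
from math import comb

def toggle_matrix(n, r):
    """T_n(I, r): entry [i][j] is 1 when i and j differ in exactly r bits."""
    size = 1 << n
    return [[1 if bin(i ^ j).count("1") == r else 0 for j in range(size)]
            for i in range(size)]

def matmul(A, B):
    """Plain integer product of two small square matrices."""
    size = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]

n = 4
size = 1 << n
T1, T2 = toggle_matrix(n, 1), toggle_matrix(n, 2)

# Each row holds C(n, r) ones and the matrix is symmetric.
assert all(sum(row) == comb(n, 2) for row in T2)
assert all(T2[i][j] == T2[j][i] for i in range(size) for j in range(size))

# 2 T_n(I,2) = T_n(I,1)^2 - n I
T1sq = matmul(T1, T1)
assert all(2 * T2[i][j] == T1sq[i][j] - (n if i == j else 0)
           for i in range(size) for j in range(size))

# The uniform vector is stationary for the normalised matrix of (3).
u = [1 / size] * size
Pu = [sum(T2[i][j] / comb(n, 2) * u[j] for j in range(size)) for i in range(size)]
assert all(abs(a - b) < 1e-12 for a, b in zip(Pu, u))
```

The dense list-of-lists representation is chosen only for readability; it is practical for cylinder sizes up to roughly n = 8.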

Theorem 1 The matrix $T_{n+1}(I,r)$ is iteratively generated from the 2x2 identity
$I$ and anti-identity $I^*$ matrix by the recursion

$$T_{n+1}(I,r) = \begin{pmatrix} T_n(I,r) & T_n(I,r-1) \\ T_n(I,r-1) & T_n(I,r) \end{pmatrix}. \tag{4}$$

PROOF. To construct $T_{n+1}(I,r)$ write out a $2^{n+1} \times 2^{n+1}$ matrix with indices
ranging from 0 to $2^{n+1} - 1$ in ascending numerical order. In binary form the
first $2^n$ of these indices will consist of strings of n + 1 digits that begin with a
0 while the second $2^n$ indices will consist of strings of n + 1 digits that begin
with a 1. This provides a partition of the matrix into four $2^n \times 2^n$ blocks.
Now consider a toggle of r digits in the strings labelling the columns of the
matrix. If the first digit is not toggled then the effect on the remaining n
digits is identical to an r-point mutation on n-digit strings. Thus, the first and
fourth quadrant of the matrix contain $T_n(I,r)$. On the other hand, if the first
digit is toggled, the effect on the remaining n digits is identical to an (r - 1)-
point mutation on the remaining n-digit strings. Thus the second and third
quadrants of the matrix contain $T_n(I,r-1)$. □
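
The block recursion can be compared against the direct Hamming-distance definition. This is a sketch under two assumptions that are not stated in the text: the recursion is started from the 1x1 matrix for n = 0 rather than from the 2x2 matrices, and a block with r < 0 or r > n is taken to be the zero matrix (no such toggle exists). The function names are hypothetical.

```python
def toggle_matrix(n, r):
    """Direct definition: 1 where the row and column indices differ in r bits."""
    size = 1 << n
    return [[1 if bin(i ^ j).count("1") == r else 0 for j in range(size)]
            for i in range(size)]

def recursive_toggle_matrix(n, r):
    """Build T_n(I, r) from the four-block recursion (4)."""
    if r < 0 or r > n:
        # Assumed convention: no r-point mutation exists, so the block is zero.
        return [[0] * (1 << n) for _ in range(1 << n)]
    if n == 0:
        return [[1]]
    same = recursive_toggle_matrix(n - 1, r)      # leading digit untouched
    flip = recursive_toggle_matrix(n - 1, r - 1)  # leading digit toggled
    top = [a + b for a, b in zip(same, flip)]
    bottom = [b + a for a, b in zip(same, flip)]
    return top + bottom

for n in range(1, 5):
    for r in range(n + 1):
        assert recursive_toggle_matrix(n, r) == toggle_matrix(n, r)
print("recursion (4) agrees with the direct definition for n = 1..4")
```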



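Corollary 2 For the probability matrices $\mathbf{T}_n(I,r) = \binom{n}{r}^{-1} T_n(I,r)$ of (3),

$$\mathbf{T}_{n+1}(I,r) = \begin{pmatrix} \frac{n-r+1}{n+1}\,\mathbf{T}_n(I,r) & \frac{r}{n+1}\,\mathbf{T}_n(I,r-1) \\ \frac{r}{n+1}\,\mathbf{T}_n(I,r-1) & \frac{n-r+1}{n+1}\,\mathbf{T}_n(I,r) \end{pmatrix} \tag{5}$$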



Theorem 3 For all n and r there exists a set of permutation matrices

$$\left\{ P_s(n,r) \;\middle|\; 1 \le s \le \binom{n}{r} \right\},$$

such that

$$T_n(I,r) = \sum_{s=1}^{\binom{n}{r}} P_s(n,r). \tag{6}$$

PROOF. By Theorem 1,

$$T_n(I,r) = \begin{pmatrix} T_{n-1}(I,r) & T_{n-1}(I,r-1) \\ T_{n-1}(I,r-1) & T_{n-1}(I,r) \end{pmatrix}.$$

Suppose that there are sets of permutation matrices $\{Q_i\}$ and $\{Q'_i\}$ such that

$$T_{n-1}(I,r) = \sum_{i=1}^{a} Q_i, \qquad T_{n-1}(I,r-1) = \sum_{i=1}^{b} Q'_i. \tag{7}$$

Since each row and column of $T_n(I,r)$ contains $\binom{n}{r}$ ones, the summation
indices are $a = \binom{n-1}{r}$ and $b = \binom{n-1}{r-1}$. Thus, $T_n(I,r)$ can be written in the form

$$T_n(I,r) = \begin{pmatrix} \sum_{i=1}^{a} Q_i & 0 \\ 0 & \sum_{i=1}^{a} Q_i \end{pmatrix} + \begin{pmatrix} 0 & \sum_{i=1}^{b} Q'_i \\ \sum_{i=1}^{b} Q'_i & 0 \end{pmatrix}. \tag{8}$$

The first matrix defines a set of permutations on $\{0, \ldots, 2^n - 1\}$ in terms of
the permutations $\{Q_i\}$ defined on $\{0, \ldots, 2^{n-1} - 1\}$ by

$$x \mapsto \begin{cases} Q_i(x) & x \in \{0, \ldots, 2^{n-1} - 1\} \\ Q_i(x - 2^{n-1}) + 2^{n-1} & x \in \{2^{n-1}, \ldots, 2^n - 1\} \end{cases} \tag{9}$$

The second matrix defines a set of permutations on $\{0, \ldots, 2^n - 1\}$ in
terms of the $\{Q'_i\}$ by

$$x \mapsto \begin{cases} Q'_i(x) + 2^{n-1} & x \in \{0, \ldots, 2^{n-1} - 1\} \\ Q'_i(x - 2^{n-1}) & x \in \{2^{n-1}, \ldots, 2^n - 1\} \end{cases} \tag{10}$$

Since $\binom{n-1}{r} + \binom{n-1}{r-1} = \binom{n}{r}$ this expresses $T_n(I,r)$ as a sum of $\binom{n}{r}$ permutations,
so long as (7) is satisfied. But for all r > 0, $T_r(I,r) = I^*$ while for r = 0 and
all n, $T_n(I,0) = I$. Since both $I$ and $I^*$ are permutation matrices the result
follows by induction. □
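
One explicit family of permutations satisfying (6) is obtained by fixing an n-digit string with r ones and adding it site-wise mod(2) to every state. The sketch below checks this for one small case; the helper names are hypothetical, and states are again assumed to be stored as integers.

```python
from itertools import combinations
from math import comb

def xor_permutations(n, r):
    """One permutation x -> x ^ mask for each n-bit mask containing r ones."""
    perms = []
    for sites in combinations(range(n), r):
        mask = sum(1 << s for s in sites)
        perms.append([x ^ mask for x in range(1 << n)])
    return perms

def sum_of_permutation_matrices(perms, size):
    """Add up the permutation matrices; column j carries a 1 in row perm[j]."""
    total = [[0] * size for _ in range(size)]
    for perm in perms:
        for j, i in enumerate(perm):
            total[i][j] += 1
    return total

def toggle_matrix(n, r):
    size = 1 << n
    return [[1 if bin(i ^ j).count("1") == r else 0 for j in range(size)]
            for i in range(size)]

n, r = 4, 2
perms = xor_permutations(n, r)
assert len(perms) == comb(n, r)
assert all(sorted(p) == list(range(1 << n)) for p in perms)   # each is a bijection
assert sum_of_permutation_matrices(perms, 1 << n) == toggle_matrix(n, r)
```

Each of these permutations is an involution, which reflects the fact that every mutation is reversible.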

Let $\mu$ be an n-digit binary string. The parity $\pi(\mu)$ of $\mu$ is defined as 0 if $\mu$
contains an even number of ones and 1 if $\mu$ contains an odd number of ones. A
string will also be referred to as having even or odd parity in these two cases.
On this basis a partition of the state space $E_n$ is defined by $E_n = E_n^{(0)} \cup E_n^{(1)}$
where

$$\mu \in \begin{cases} E_n^{(0)} & \text{if } \pi(\mu) = 0 \\ E_n^{(1)} & \text{if } \pi(\mu) = 1 \end{cases} \tag{11}$$

If r is odd then an r-point mutation takes elements of $E_n^{(0)}$ to $E_n^{(1)}$ and vice
versa while if r is even it takes elements of $E_n^{(1)}$ to $E_n^{(1)}$ and elements of $E_n^{(0)}$
to $E_n^{(0)}$. In addition, for r > 0 no element of $E_n$ will mutate to itself so the
graph $H_n(r)$ contains no loops. Since each mutation is reversible, however, the
shortest cycle period in $H_n(r)$ is always two.

Lemma 4 Let $M_n$ be any $2^n \times 2^n$ matrix with indices labelled in ascending
order from 0 to $2^n - 1$. Then there is a permutation matrix $P_n$ such that the
indices i,j of $[P_n^{-1} M_n P_n]_{ij}$, when expressed in binary form, satisfy

$$\pi(i) = \pi(j) = \begin{cases} 0 & 0 \le i, j \le 2^{n-1} - 1 \\ 1 & 2^{n-1} \le i, j \le 2^n - 1 \end{cases} \tag{12}$$

In addition, if $\pi(i) = \pi(i')$ and $i < i'$ numerically then i precedes i' as an index.
(That is, both even and odd parity indices are given in ascending numerical
order.) Further, if $P_{n+1}$ is the permutation matrix that produces this ordering
for $2^{n+1} \times 2^{n+1}$ matrices then $P_{n+1}$ is obtained from $P_n$ as follows: write $P_n^{-1}$
in terms of two $2^{n-1} \times 2^n$ matrices Q and R in the form

$$P_n^{-1} = \begin{pmatrix} Q \\ R \end{pmatrix} \tag{13}$$

Then,

$$P_{n+1}^{-1} = \begin{pmatrix} Q & 0 \\ 0 & R \\ R & 0 \\ 0 & Q \end{pmatrix} \tag{14}$$

The index ordering resulting from application of Lemma 4 will be called parity
ordering.

Lemma 5 With parity ordering of indices the matrix $T_n(I,r)$ takes the form

$$T_n(I,r) = \begin{cases} \begin{pmatrix} 0 & A \\ A & 0 \end{pmatrix} & r \text{ odd} \\[2ex] \begin{pmatrix} B & 0 \\ 0 & B \end{pmatrix} & r \text{ even} \end{cases} \tag{15}$$

PROOF. Since $T_n(I,r)$ is symmetric, and odd toggles change the parity of
a state while even toggles preserve it, $T_n(I,r)$ must at least have the form

$$T_n(I,r) = \begin{cases} \begin{pmatrix} 0 & A^T \\ A & 0 \end{pmatrix} & r \text{ odd} \\[2ex] \begin{pmatrix} B & 0 \\ 0 & C \end{pmatrix} & r \text{ even} \end{cases} \tag{16}$$

where $A$, $A^T$, $B$, and $C$ are all square matrices of the same size. Let j be
the binary form of the j-th even index. An r-point mutation of j has the form
$j + \eta$ where the n digit binary string $\eta$ contains r ones and addition is site-
wise mod(2). For odd r, $j + \eta \in E_n^{(1)}$ while $j + \eta + \alpha_n \in E_n^{(0)}$ where $\alpha_n$
is the string $0 \ldots 01$ with a single 1 in the n-th position. But j and $j + \alpha_n$
label corresponding columns of $A^T$ and $A$ while $j + \eta$ and $j + \eta + \alpha_n$ label
corresponding rows of $A^T$ and $A$. Thus $A^T = A$. Likewise, if r is even then j
and $j + \alpha_n$ label corresponding columns of $B$ and $C$ while $j + \eta$ and $j + \eta + \alpha_n$
label corresponding rows of $B$ and $C$, showing that $C = B$. □
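
The block structure of Lemma 5 can be observed directly by listing the even-parity indices first and the odd-parity indices second, each in ascending order. The following is a small sketch with hypothetical helper names; it reorders the indices instead of forming the permutation matrix explicitly.

```python
def parity(x):
    """0 for an even number of ones in the binary form of x, 1 otherwise."""
    return bin(x).count("1") % 2

def parity_order(n):
    """Indices 0..2^n-1 with even parity first, each group in ascending order."""
    idx = range(1 << n)
    return [i for i in idx if parity(i) == 0] + [i for i in idx if parity(i) == 1]

def reordered_toggle_matrix(n, r):
    """T_n(I, r) with rows and columns listed in parity order."""
    order = parity_order(n)
    return [[1 if bin(i ^ j).count("1") == r else 0 for j in order] for i in order]

def blocks(M):
    """Split a square matrix of even order into its four quadrants."""
    h = len(M) // 2
    top_left = [row[:h] for row in M[:h]]
    top_right = [row[h:] for row in M[:h]]
    bottom_left = [row[:h] for row in M[h:]]
    bottom_right = [row[h:] for row in M[h:]]
    return top_left, top_right, bottom_left, bottom_right

n = 4
zero = [[0] * (1 << (n - 1)) for _ in range(1 << (n - 1))]
for r in range(1, n):
    tl, tr, bl, br = blocks(reordered_toggle_matrix(n, r))
    if r % 2:   # odd r: off-diagonal blocks are equal, diagonal blocks vanish
        assert tl == zero and br == zero and tr == bl
    else:       # even r: diagonal blocks are equal, off-diagonal blocks vanish
        assert tr == zero and bl == zero and tl == br
print(parity_order(3))
```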



Theorem 6 If r is odd, $T_n(I,r)$ is irreducible and bipartite.

PROOF. If r is odd there can be no cycles in $H_n(r)$ having odd period
since successive mutations will oscillate between $E_n^{(1)}$ and $E_n^{(0)}$. Since period
two cycles are possible, the index of imprimitivity of $T_n(I,r)$ is 2 and thus
$T_n(I,r)$ is bipartite and irreducible. □

Lemma 7 ([3, p. 74]) Let M be an irreducible non-negative matrix of order
n with index of imprimitivity k and let m be a positive integer. Then $M^m$ is
irreducible if and only if k and m are relatively prime.

Since for odd r the index of imprimitivity of $T_n(I,r)$ is 2, this implies that if
r is odd $T_n^m(I,r)$ will be irreducible if and only if m is odd. Note also that
Theorem 6 implies the graph $H_n(r)$ is strongly connected if and only if r is
odd, while for even r this graph is composed of two disjoint strongly connected
components with respective vertex sets $E_n^{(1)}$ and $E_n^{(0)}$.
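
The connectivity statement can be explored with a simple graph search. The sketch below uses hypothetical names and restricts attention to 0 < r < n; for r = n the matrix is the anti-identity, so each state is joined only to its complement, and that boundary case is left out of the check.

```python
from itertools import combinations

def components(n, r):
    """Vertex sets of the connected components of H_n(r), found by graph search."""
    masks = [sum(1 << s for s in sites) for sites in combinations(range(n), r)]
    seen, comps = set(), []
    for start in range(1 << n):
        if start in seen:
            continue
        comp, stack = {start}, [start]
        while stack:
            x = stack.pop()
            for m in masks:
                y = x ^ m          # an r-point mutation of x
                if y not in comp:
                    comp.add(y)
                    stack.append(y)
        seen |= comp
        comps.append(comp)
    return comps

n = 5
for r in range(1, n):
    print(r, len(components(n, r)))
```

For even r the two components returned are the even-parity and odd-parity states, as the parity argument predicts.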



3 State Transition Representations 

The space $E_n$ of n-digit binary strings with periodic boundary conditions is
the state space for all binary valued cellular automata acting on cylinders
of size n. In Section 2 the structure of the graph $H_n(r)$ was studied on this
space. This structure describes the results of point mutations. The action of
a CA rule on $E_n$ can also be described in terms of a graph. This is usually
referred to as the state transition diagram, but there are other diagrams that
also provide information.



3.1 State Transition Diagrams 

A cellular automaton rule X acting on $E_n$ defines a directed graph G(X)
with vertex set $E_n$ and with an edge from $\mu$ to $\mu'$ if and only if $X(\mu) = \mu'$.
This is the standard state transition diagram (STD) for X on $E_n$. The
structure and generation of this diagram has been extensively studied [33,36].
The STD partitions into disjoint subgraphs, each of which constitutes a basin
of attraction. Each such basin contains an attractor, either a cycle or fixed
point, of the CA rule. Figure 1 shows the STDs for the binary difference rule
(rule 60) [30] and the extensively studied rule 18 [9] on a cylinder of size 6.

Fig. 1. State transition diagrams for left justified rule 60 and 18 on a cylinder of
size 6. The numbers accompanying each diagram enumerate the cycle structures.
The presence of more than one number indicates that several cycles have the same
structure.

A left-justified k-site CA rule X is additive if it can be represented in the form

$$X = \sum_{i=0}^{k-1} a_i \sigma^i, \tag{17}$$

where $\sigma$ is the left shift and the coefficients, $a_i$, take values in {0, 1}. Additive
rules are particularly well behaved. For example, (17) is equivalent to the
condition $X(\mu + \mu') = X(\mu) + X(\mu')$ for all states $\mu$ and $\mu'$. The state $\vec{0}$
consisting of all zeros is a fixed point for all additive rules and the state $\vec{1}$
consisting of all ones is a fixed point for those rules in which an odd number
of the coefficients in (17) are not zero.
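
The representation (17) translates directly into code. The sketch below makes assumptions that are not in the text: a state is stored as an integer with site j in bit j, so the shift becomes a bit rotation, and the names `shift`, `additive_rule` and `cyclic_states` are hypothetical. It builds the binary difference rule (coefficients 1, 1) on a cylinder of size 6 and checks additivity.

```python
def shift(x, n, i):
    """Cyclic shift on n-bit states: site j of the result holds site j+i of x."""
    i %= n
    mask = (1 << n) - 1
    return ((x >> i) | (x << (n - i))) & mask

def additive_rule(coeffs, n):
    """Return X = sum_i a_i sigma^i (mod 2) for coeffs = (a_0, ..., a_{k-1})."""
    def X(x):
        y = 0
        for i, a in enumerate(coeffs):
            if a:
                y ^= shift(x, n, i)   # site-wise addition mod 2
        return y
    return X

def cyclic_states(X, n):
    """States on cycles or fixed points: the set left unchanged by applying X."""
    states = set(range(1 << n))
    while True:
        image = {X(x) for x in states}
        if image == states:
            return states
        states = image

n = 6
D = additive_rule((1, 1), n)      # binary difference rule, I + sigma
ones = (1 << n) - 1
print(D(0) == 0, D(ones) == 0)    # an even number of coefficients sends all ones to zero
print(all(D(a ^ b) == D(a) ^ D(b) for a in range(1 << n) for b in range(1 << n)))
print(len(cyclic_states(D, n)), "states lie on attractors")
```

Following each state forward until it enters `cyclic_states` then recovers the basin structure of the state transition diagram.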

Lemma 8 ([22]) Let X be the global transition rule for an additive cellular
automaton. Then the state transition diagram of X acting on $E_n$ consists of
cycles and fixed points with trees rooted at each state on a cycle and at each
fixed point. Further, all trees are topologically isomorphic to the tree rooted at
the fixed point $\vec{0}$.



If X is an additive rule then its parity can be defined in terms of the
representation (17) by

$$\pi(X) = \begin{cases} 0 & \text{an even number of the } a_i \text{ are 1} \\ 1 & \text{an odd number of the } a_i \text{ are 1} \end{cases} \tag{18}$$



Site-wise binary addition of an even number of binary strings of equal parity
always yields a string with even parity, while addition of an odd number of binary
strings of even parity yields a string with even parity and addition of an odd
number of binary strings of odd parity yields a string with odd parity. Since all
shifts of a string have the same parity as the string itself, we have:

Lemma 9 Let $X : E_n \to E_n$ be an additive rule.

(1) If $\pi(X) = 0$ then X maps both $E_n^{(0)}$ and $E_n^{(1)}$ to $E_n^{(0)}$.

(2) If $\pi(X) = 1$ then X maps $E_n^{(0)}$ to $E_n^{(0)}$ and $E_n^{(1)}$ to $E_n^{(1)}$.

An immediate consequence of this lemma, together with (17), is that if X is
an additive rule with even parity then no state with odd parity can have a
predecessor in $E_n$. Thus the states in $E_n^{(1)}$ must reside at the top of the trees in
the STD. Also, if X has odd parity the STD must partition into two distinct
components with $E_n^{(0)}$ and $E_n^{(1)}$ as their respective vertex sets. In general,
states without predecessors will be called peripheral and all other states will
be called internal.
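
Lemma 9 and its consequence for peripheral states can be checked exhaustively on small cylinders. This is an illustrative sketch; the helper names are hypothetical and states are assumed to be integers with site j in bit j.

```python
def shift(x, n, i):
    """Cyclic shift on n-bit states: site j of the result holds site j+i of x."""
    i %= n
    return ((x >> i) | (x << (n - i))) & ((1 << n) - 1)

def additive_rule(coeffs, n):
    """X = sum_i a_i sigma^i (mod 2) for coeffs = (a_0, ..., a_{k-1})."""
    def X(x):
        y = 0
        for i, a in enumerate(coeffs):
            if a:
                y ^= shift(x, n, i)
        return y
    return X

def parity(x):
    return bin(x).count("1") % 2

def satisfies_lemma9(coeffs, n):
    """Even rules send everything to even parity; odd rules preserve parity."""
    X = additive_rule(coeffs, n)
    rule_parity = sum(coeffs) % 2
    return all(parity(X(x)) == (parity(x) if rule_parity else 0)
               for x in range(1 << n))

def peripheral_states(coeffs, n):
    """States with no predecessor under the rule."""
    X = additive_rule(coeffs, n)
    return set(range(1 << n)) - {X(x) for x in range(1 << n)}

print(satisfies_lemma9((1, 1), 6), satisfies_lemma9((1, 1, 1), 5))
odd_states = {x for x in range(1 << 6) if parity(x) == 1}
print(odd_states <= peripheral_states((1, 1), 6))   # odd states have no predecessor
```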

Proof of the next theorem follows directly from Lemma 9. 

Theorem 10 Let X be an additive rule and let $\{\mu(s) \mid 1 \le s \le p\}$ be a cycle
of X having period p. Then all states on this cycle have the same parity.

Lemma 11 Let $X : E_n \to E_n$ be additive with $\pi(X) = 1$ and let n be odd.
Then the components of the STD of X corresponding to the vertex sets $E_n^{(0)}$
and $E_n^{(1)}$ are isomorphic.

PROOF. Let $\mu \in E_n^{(0)}$ and set $\mu' = \vec{1} + \mu$. Since n is odd, $\mu' \in E_n^{(1)}$ and
$X(\mu') = X(\vec{1}) + X(\mu)$. Further, $X(\vec{1}) = \vec{1}$ since it is the sum of an odd
number of shifts of $\vec{1}$. Thus for all $\mu \in E_n^{(0)}$ there is an element $\vec{1} + \mu \in E_n^{(1)}$
that maps in an identical manner under X and vice versa. □

Note that rule 150 acting on $E_4$ provides a counter-example to Lemma 11 in
the case that n is even. The catch is that for even n both $\vec{0}$ and $\vec{1}$ are in $E_n^{(0)}$.

Lemma 12 Let X be an additive rule with $\pi(X) = 0$. If the cylinder size n
is odd then the maximum tree height $h^*$ is equal to 1.

                     (n-1)-st row: 0    (n-1)-st row: 1
    n-th row: 0            2r                2r-1
    n-th row: 1            2r-1              2r-2

Table 1
The number of ones in the i-th column that lie above the final two rows.

PROOF. With X given by (17) the rule can be represented by the $n \times n$
circulant matrix $X_c = \mathrm{circ}_n(a_0, \ldots, a_{k-1})$. If n is odd and $\pi(X) = 0$ this is
a matrix of odd order with each row and column containing the same even
number, say 2r, of ones. Thus, summation mod(2) of the first n - 1 rows of $X_c$
and addition, mod(2), of this sum to the final row must yield a row of zeros.
But it is not possible, using elementary operations and mod(2) arithmetic,
to produce another row of zeros in this matrix. To see this, consider the i-th
column and the (n - 1)-st row. The entry in this position will be a 0 or a 1 and
the i-th entry in the original n-th row will also have been a 0 or a 1. There
are therefore four cases. The number of ones in the i-th column that lie above
the final two rows are indicated in Table 1 for each case.

If a second row of zeros is to be possible, a linear combination of the first n - 2
rows of $X_c$ must equal the value in the (n - 1)-st row for each column. This
combination must have an odd number of ones in those columns in which the
(n - 1)-st row has a 1, and an even number of ones in the columns in which
the (n - 1)-st row has a 0. Examination of Table 1, however, yields the parity
of the number of rows required for each block of Table 1: both entries in the
first column must be even and both in the second column must be odd. This
indicates that the only case in which the contradiction of requiring both an
odd and an even number of rows in the sum will not occur is if the (n - 1)-st
row consists of all zeros or all ones. The first case corresponds to the 0-rule
and the second cannot occur since n is odd while $\pi(X) = 0$. Thus, the nullity
of the matrix $X_c$ is $\nu = 1$. By a theorem of Martin, et al. [22], the in degree of
fixed points or states on a cycle of an additive rule is $2^\nu$. Thus, for the cases in
question the in degree of $\vec{0}$ is two and since $X(\vec{0}) = \vec{0}$ and $X(\vec{1}) = \vec{0}$ this means
that $\vec{1}$ is the only non-trivial predecessor of $\vec{0}$. Since all trees are topologically
isomorphic to the tree rooted at $\vec{0}$ the proof is done. □
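
The statement of Lemma 12 can be checked by brute force for a few even-parity rules on odd cylinders. The sketch below is an exhaustive search, not a reproduction of the circulant-matrix argument; the helper names are hypothetical and states are integers with site j in bit j. The binary difference rule on the even cylinder n = 6 is included for contrast.

```python
def shift(x, n, i):
    """Cyclic shift on n-bit states: site j of the result holds site j+i of x."""
    i %= n
    return ((x >> i) | (x << (n - i))) & ((1 << n) - 1)

def additive_rule(coeffs, n):
    """X = sum_i a_i sigma^i (mod 2) for coeffs = (a_0, ..., a_{k-1})."""
    def X(x):
        y = 0
        for i, a in enumerate(coeffs):
            if a:
                y ^= shift(x, n, i)
        return y
    return X

def cyclic_states(X, n):
    """States on cycles or fixed points."""
    states = set(range(1 << n))
    while True:
        image = {X(x) for x in states}
        if image == states:
            return states
        states = image

def max_tree_height(X, n):
    """Largest number of iterations any state needs to reach an attractor."""
    cyc = cyclic_states(X, n)
    best = 0
    for x in range(1 << n):
        h = 0
        while x not in cyc:
            x = X(x)
            h += 1
        best = max(best, h)
    return best

even_rules = [(1, 1), (1, 0, 1), (1, 1, 1, 1)]   # each has an even number of ones
for coeffs in even_rules:
    for n in (5, 7):
        print(coeffs, n, max_tree_height(additive_rule(coeffs, n), n))
print("n = 6:", max_tree_height(additive_rule((1, 1), 6), 6))
```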



For additive rules with $\pi(X) = 1$ the situation is more complicated. If $\pi(X) = 0$
the minimum possible height for trees is 1 since elements of $E_n^{(1)}$ have no
predecessors. If $\pi(X) = 1$ this is not the case and the minimum possible
height is 0. This occurs trivially for $X = \sigma^k$ for any k, but other cases for
which $h^* = 0$ also exist as indicated by the next theorem.

Theorem 13 ([30]) The maximum tree height for the rule $I + \sigma + \sigma^2$ (rule
150, left justified) is 0 unless $n \equiv 0 \pmod 3$.

Conjecture 14 Let X be an additive rule and let k(X) denote the number of
non-zero coefficients in the expression (17). If $\pi(X) = 1$ then $h^* = 0$ unless
$k(X) \mid n$.
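
Theorem 13 lends itself to a direct numerical check for small cylinder sizes. The following sketch computes the maximum tree height of left-justified rule 150 by exhaustive iteration; the helper names are hypothetical, and states are assumed to be integers with site j in bit j.

```python
def shift(x, n, i):
    """Cyclic shift on n-bit states: site j of the result holds site j+i of x."""
    i %= n
    return ((x >> i) | (x << (n - i))) & ((1 << n) - 1)

def rule150(n):
    """Left-justified rule 150: X = I + sigma + sigma^2 (mod 2)."""
    return lambda x: x ^ shift(x, n, 1) ^ shift(x, n, 2)

def max_tree_height(X, n):
    """Largest number of iterations any state needs to reach an attractor."""
    states = set(range(1 << n))
    while True:                      # shrink to the set of cyclic states
        image = {X(x) for x in states}
        if image == states:
            break
        states = image
    best = 0
    for x in range(1 << n):
        h = 0
        while x not in states:
            x = X(x)
            h += 1
        best = max(best, h)
    return best

heights = {n: max_tree_height(rule150(n), n) for n in range(3, 10)}
for n, h in heights.items():
    print(n, h)
```

A height of 0 means that the rule is a bijection on that cylinder, so every state lies on a cycle.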

In describing transitions between attractor basins there are several ways to 
represent the set of basins for any given rule. In this paper it is assumed that 
mutations occur at a rate much faster than the time scale of rule iteration. If it 
is assumed that mutations occur on a time scale that is much slower than rule 
iterations (e.g., as is the case in the computation of "metagraphs" in discrete 
dynamics lab [34]) then only the attractors are relevant. 



3.2 The State-Basin Representation 

If $B_\alpha$ is an attractor basin for a CA rule X and $\mu \in B_\alpha$ then the height h of
$\mu$ above the attractor is 0 if $\mu$ is on the attractor and is the minimum integer
h such that $X^h(\mu)$ lies on the attractor otherwise.

If $\{B_\alpha \mid 0 \le \alpha \le a\}$ is the set of attractor basins for a CA rule X then $E_n$ can
be partitioned into disjoint classes labelled $\alpha{:}h(\alpha)$ where $\alpha$ indicates a specific
attractor and $h(\alpha)$ is a height above the attractor $\alpha$ in the basin $B_\alpha$. Thus,
each class consists of those states that are at a specified height above a given
attractor. Conventionally, if $\vec{0}$ is a fixed point the corresponding basin will be
denoted $B_0$. This defines what will be called the state-basin partition of $E_n$.

Classes in the state-basin partition are taken as the vertices of a graph $H_n(X,r)$
with an edge connecting two vertices for each r-point mutation that takes a
state in the state-basin class labelling one vertex to the state-basin class
labelling the other vertex. The matrix $T_n(X,r)$ is defined as the weighted
adjacency matrix of this graph. If X is additive, $T_n(X,r)$ will have size $N \times N$
with $N = \sum_\alpha [h^*(\alpha) + 1]$ where $h^*(\alpha)$ is the maximum tree height for basin
$B_\alpha$.

Clearly $T_n(X,r)$ is symmetric and $T_n(I,r)$ is the special case in which X is the
identity rule. $T_n(X,r)$ can be directly obtained from $T_n(I,r)$ by summing over
the rows and columns corresponding to each state-basin class. Unfortunately
there are no general equivalents to Theorems 1, 3, and 6 that are valid for
arbitrary rules.

As with $T_n(I,r)$, a probability matrix $\mathbf{T}_n(X,r)$ is defined by dividing each
column of $T_n(X,r)$ by its sum. The state-basin classes for the binary difference
rule D, and $T_n(D,1)$ are given in Appendix B for a cylinder of size 6.

The value of $[T_n(X,r)]_{ij}$ gives the number of distinct paths in $H_n(X,r)$ from
vertex j to vertex i and the characteristic polynomial of $T_n(X,r)$ provides
information on the number of cycles in this graph.
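
The construction of the state-basin matrix can be sketched as follows for the binary difference rule on a cylinder of size 6. The code is an illustration under stated assumptions rather than the authors' procedure: attractors are named by the smallest state on them, classes are sorted by (attractor, height), and all function names are hypothetical.

```python
from itertools import combinations

def shift(x, n, i):
    i %= n
    return ((x >> i) | (x << (n - i))) & ((1 << n) - 1)

def additive_rule(coeffs, n):
    def X(x):
        y = 0
        for i, a in enumerate(coeffs):
            if a:
                y ^= shift(x, n, i)
        return y
    return X

def cyclic_states(X, n):
    states = set(range(1 << n))
    while True:
        image = {X(x) for x in states}
        if image == states:
            return states
        states = image

def state_basin_labels(X, n):
    """Label every state by (attractor id, height above that attractor)."""
    cyc = cyclic_states(X, n)
    attractor = {}
    for c in sorted(cyc):                 # smallest state on a cycle names it
        y = c
        while y not in attractor:
            attractor[y] = c
            y = X(y)
    labels = {}
    for x in range(1 << n):
        h, y = 0, x
        while y not in cyc:
            y = X(y)
            h += 1
        labels[x] = (attractor[y], h)
    return labels

def state_basin_matrix(X, n, r):
    """Count r-point mutations from the column class into the row class."""
    labels = state_basin_labels(X, n)
    classes = sorted(set(labels.values()))
    index = {c: k for k, c in enumerate(classes)}
    T = [[0] * len(classes) for _ in classes]
    for x in range(1 << n):
        for sites in combinations(range(n), r):
            y = x ^ sum(1 << s for s in sites)
            T[index[labels[y]]][index[labels[x]]] += 1
    return classes, T

n = 6
D = additive_rule((1, 1), n)
classes, T = state_basin_matrix(D, n, 1)
N = len(classes)
col_sum = [sum(T[i][j] for i in range(N)) for j in range(N)]
P = [[T[i][j] / col_sum[j] for j in range(N)] for i in range(N)]  # probability matrix
print(N, "state-basin classes")
```

Each column of `P` sums to one, and the vector of relative class sizes is left unchanged by `P`, which is the aggregated form of the uniform steady state over individual states.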

Lemma 15 ([3, p.76]) Let $\phi(\lambda) = \lambda^n + c_1 \lambda^{n-1} + c_2 \lambda^{n-2} + \cdots + c_n$ be the
characteristic polynomial of a matrix M which is the adjacency matrix of a
graph G. Then

(1) The coefficient $c_r$ of $\lambda^{n-r}$ is, up to the sign $(-1)^r$, the sum of the
determinants of all principal minors of M of size r.

(2) The value of $|c_r|$ equals the number of cycles of G with periods that sum
to r.

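Lemma 16 ([3, p. 77]) Let M be an r-cyclic $n \times n$ matrix with $M^r$ having
block diagonal form

$$M^r = \begin{pmatrix} B_1 & 0 & \cdots & 0 \\ 0 & B_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & B_r \end{pmatrix}$$

then there exists a monic polynomial $f(\lambda)$ and non-negative integers $p_1, \ldots, p_r$
such that

(1) $f(0) \neq 0$.

(2) For all $i \le r$ the characteristic polynomial of $B_i$ is $\lambda^{p_i} f(\lambda)$.

(3) The characteristic polynomial of M is $\lambda^{p_1 + \cdots + p_r} f(\lambda^r)$.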

Since the index of imprimitivity of the matrix in this lemma is r, it is r-periodic
and every cycle must have period divisible by r. Thus, the only non-zero
coefficients in the characteristic polynomial are $c_{kr}$ for $0 \le k \le n/r$. On
this basis, if the number r of toggled sites is odd and $T_n(X,r)$ is bipartite
(index of imprimitivity 2), the characteristic polynomial of $T_n(X,r)$ will have
the form

$$\phi(\lambda) = \lambda^k + \sum_{i=1}^{\lfloor k/2 \rfloor} c_{2i}\, \lambda^{k-2i} \tag{19}$$

where k is the order of $T_n(X,r)$.



3.3 Shift-Basin and ACS Representations 



Every cellular automaton rule commutes with the shift operator. This leads
to a close relation between the state transition diagram and an equivalent
diagram defined in terms of shift cycles. This will be called the shift-basin
diagram (SBD). The state space $E_n$ is first partitioned into shift equivalence
classes with states $\mu$ and $\mu'$ belonging to the same shift class if and only if
for some r it is true that $\mu' = \sigma^r(\mu)$ where $[\sigma(\mu)]_j = \mu_{j+1}$. The set of shift
classes, $S(n) = \{S_r(n)\}$ is taken as a new state space and the rule $X : E_n \to E_n$
induces a mapping $X^* : S(n) \to S(n)$ by taking $X^*(S_j(n)) = S_i(n)$ if, for some
$\mu \in S_j(n)$, $X(\mu) \in S_i(n)$. The shift-basin diagram (SBD) is then defined as
the state transition diagram of the map $X^*$.

Fig. 2. Shift-Basin Diagrams for left justified rules 60 and 18 on a cylinder of size
6. The numbers accompanying each diagram enumerate the cycle structures. The
presence of more than one number indicates that several cycles have the same
structure.

If a cycle of $X : E_n \to E_n$ is a shift cycle it appears as a fixed point of $X^*$.
On the other hand, if a cycle of $X : E_n \to E_n$ consists of states drawn from
m distinct shift cycles then this appears as a period m cycle of $X^*$. Figure
2 shows the SBDs for n = 6 for the binary difference rule and for rule 18.
These can be compared to the STDs of Figure 1. The result on topological
isomorphism between trees for additive CAs does not carry over to the
shift-basin representation, as can be seen from Figure 2.
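
The passage from the state transition diagram to the shift-basin diagram can be sketched in a few lines. The code below assumes states are integers with site j in bit j and names each shift class by the smallest integer among its rotations; both conventions, and the function names, are hypothetical choices made for illustration.

```python
def shift(x, n, i):
    """Cyclic shift on n-bit states: site j of the result holds site j+i of x."""
    i %= n
    return ((x >> i) | (x << (n - i))) & ((1 << n) - 1)

def shift_class(x, n):
    """Canonical name of the shift class of x: its smallest rotation."""
    return min(shift(x, n, i) for i in range(n))

def shift_basin_map(X, n):
    """The induced map X* on shift classes, as a dict class -> class."""
    return {shift_class(x, n): shift_class(X(x), n) for x in range(1 << n)}

n = 6
D = lambda x: x ^ shift(x, n, 1)          # binary difference rule, I + sigma

# The rule commutes with the shift, so X* does not depend on the representative.
assert all(D(shift(x, n, 1)) == shift(D(x), n, 1) for x in range(1 << n))

Xstar = shift_basin_map(D, n)
fixed = sorted(c for c, image in Xstar.items() if c == image)
print(len(Xstar), "shift classes; fixed points of X*:", fixed)
```

Because the dictionary has one entry per shift class, the matrices built on this representation are correspondingly smaller than those of the state-basin representation.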

The following theorem relating the state-basin and shift-basin diagrams is a
version of a result of Jen [17]:

Theorem 17 For a given $n$ let $S = \{S_i(n) \mid 1 \le i \le m\}$ be an $m$-cycle of
$X^*: S(n) \to S(n)$ and let $r$ be the smallest integer for which $X^{m}(\mu) = \sigma^{-r}(\mu)$
when $\mu$ lies in a shift class on this cycle. Then $\mu$ lies on a cycle of $X: E_n \to E_n$
having period $ms$ where $s$ is the smallest positive integer such that $rs \equiv 0 \pmod{n}$.

PROOF. Let $\{c_i \mid 1 \le i \le m\}$ be an $m$-cycle of $X^*$. Then each $c_i$ is a shift
class in $E_n$ and, since $X^{*m}(c_i) = c_i$, for each state $\mu \in c_i$ there is a smallest
integer $r < n$ such that $X^{m}(\mu) = \sigma^{-r}(\mu)$. Because $X$ commutes with $\sigma$, this
gives $X^{mk}(\mu) = \sigma^{-rk}(\mu)$ for every $k$. Thus, if $s$ is the smallest positive integer for
which $rs \equiv 0 \pmod{n}$ then $X^{ms}(\mu) = \mu$ and this is not true for $X^{k}(\mu)$ for
any $k < ms$. □
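The construction of $X^*$ and the period count of Theorem 17 can be tried out numerically. The following is a minimal sketch, not taken from the paper: states are stored as integers whose bit $j$ holds $\mu_j$, the binary difference rule is taken in the left justified form $D = I + \sigma$, and the helper names (`shift`, `rule60`, `shift_class`, `period`, `predicted_period`) are hypothetical. One assumption is labelled in the code: the modulus used for $s$ is the length of the shift cycle containing $\mu$, which coincides with $n$ whenever $\mu$ has no rotational symmetry.

```python
def shift(mu, n, r=1):
    """sigma^r, with [sigma(mu)]_j = mu_{j+1}; bit j of the integer holds mu_j."""
    r %= n
    return ((mu >> r) | (mu << (n - r))) & ((1 << n) - 1)

def rule60(mu, n):
    """Binary difference rule in left justified form, D = I + sigma."""
    return mu ^ shift(mu, n)

def shift_class(mu, n):
    """Canonical representative (smallest integer) of the shift class of mu."""
    return min(shift(mu, n, r) for r in range(n))

def period(f, x):
    """Length of the cycle that is eventually reached from x under the map f."""
    seen, t = {}, 0
    while x not in seen:
        seen[x] = t
        x = f(x)
        t += 1
    return t - seen[x]

def predicted_period(rule, n, mu):
    """Period m*s in the sense of Theorem 17, for a state mu lying on a cycle."""
    # m: period of the shift class of mu under the induced map X*
    m = period(lambda c: shift_class(rule(c, n), n), shift_class(mu, n))
    image = mu
    for _ in range(m):
        image = rule(image, n)                      # X^m(mu)
    # Assumption of this sketch: work modulo the length q of mu's shift cycle
    # (q equals n unless mu has a rotational symmetry).
    q = len({shift(mu, n, k) for k in range(n)})
    r = next(k for k in range(q) if shift(mu, n, -k) == image)
    s = next(k for k in range(1, q + 1) if (r * k) % q == 0)
    return m * s
```

For the state 00011 on a cylinder of size 5, `predicted_period(rule60, 5, 0b00011)` agrees with the period 15 cycle described in Section 5: the three shift cycles give $m = 3$ and the shift accumulated over three steps gives $s = 5$.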

Another form of representation is based on the idea of an autocatalytic set
(ACS) in a graph. The vertices of a graph are first partitioned into two classes,
those with in-degree 0, and all others. The set of vertices with in-degree 0 forms
the periphery of the graph. The set of all graph vertices is then partitioned into
connected subgraphs and each such subgraph, excluding its peripheral vertices,
is an ACS. "An autocatalytic set (ACS) is a subgraph, each of whose nodes
has at least one incoming link from a node belonging to the same subgraph."
[15]. An ACS basin representation can be constructed from either STD or
SBD graphs. In either case, the ACS basin representation may differentially
mix parities from state representation or shift representation categories.

Each ACS and its corresponding peripheral set now becomes a pair of
equivalence classes, and these classes again can be taken as forming a state space
which will be denoted either $C_n$ or $C_n^*$ depending on whether it is defined
from the STD or SBD diagram. In either case the rule $X$ defines a map
$X': C_n \to C_n$ or $X'': C_n^* \to C_n^*$, and in both of these cases matrices $T_n(X',\tau)$,
$\bar{T}_n(X',\tau)$, $T_n(X'',\tau)$, and $\bar{T}_n(X'',\tau)$ can be defined.

The shift-basin and ACS representations are useful since the associated ma- 
trices are smaller than that for the state-basin representation. On the other 
hand, the state-basin representation will be more useful if probability distri- 
butions on attractor basins are used to model discrete potential wells. In this 
paper the main focus will be on the state and shift representations. 



4 Properties of $T_n(X,\tau)$ and $\bar{T}_n(X,\tau)$

A number of characteristics of the matrices $T_n(X,\tau)$ and $\bar{T}_n(X,\tau)$ can be
determined, especially if the rule $X$ is additive. For additive rules, all trees are
topologically identical to the tree rooted at $\vec{0}$, providing a particularly nice
link between the form of $T_n(X,\tau)$ and the structure of the state transition
diagram.

Let $X$ be an additive rule with the state $\vec{0}$ having in-degree $d$ and maximum
tree height $h^*$. Further, suppose that all trees are balanced, i.e., that
all branchings are topologically identical and following any branch from $\vec{0}$
upward eventually reaches a height of $h^*$. Then each attractor state has $d - 1$
predecessors of height 1 while all internal states not on an attractor have $d$
predecessors. Thus, each rooted tree will contain $d^{h^*}$ states. Of these, $d^{h^*-1}$
will be interior states and $d^{h^*-1}(d - 1)$ will be peripheral states.

To go further, additional assumptions about the number of peripheral states
will be required. In addition, it is useful to choose the indexing of $T_n(X,\tau)$ so as
to put this matrix into a simple form. If $\pi(X) = 0$, interior/peripheral indexing
is chosen, with indices representing classes of even parity that are composed
of peripheral states placed to the right of indices representing classes of even
parity that are composed of interior states. If $\pi(X) = 1$, parity indexing will
be used. Note that if $\pi(X) = 0$ and all even states have predecessors then
interior/peripheral and parity indexing are the same. Three simple cases will
be considered.

Case 1: $\pi(X) = 0$ and the set of peripheral states is just the set of odd parity
states. Since each tree contains $d^{h^*-1}(d - 1)$ peripheral states this means that
the total number of trees for a cylinder of size $n$ is given by

\[ \frac{2^{\,n-1}}{d^{\,h^*-1}(d-1)} . \tag{20} \]

Each tree is rooted at a fixed point or on a cycle, so (20) is also the number
of points that are fixed points or lie on cycles of $X$. Since this number must be
an integer, there will be constraints on the possible values of $n$ and $d$. Suppose
that $d = 2^{r} m$ where $m$ is odd and $r < n - 1$. Then (20) becomes

\[ \frac{2^{\,n - r(h^*-1) - 1}}{(2^{r} m - 1)\, m^{\,h^*-1}} , \]

but $m$ is odd, hence both terms in the denominator of this expression are odd
and it cannot be an integer unless $m$ and $r$ both equal 1. In that case $d = 2$
and the number of trees given by (20) becomes $2^{\,n-h^*}$.

Conjecture 18 The only cellular automata rules satisfying the above con- 
ditions are multiples of the binary difference rule (rule 60) by powers of the 
shift. 

Under these conditions, the number of states at height $h$ in a tree is $2^{h-1}$
($h \ge 1$) and the total number of states at height $h$ is $2^{\,n+h-h^*-1}$. Let $N_p$ be the
number of attractors of period $p$, including $p = 1$ for fixed points. Then

\[ \sum_p p\,N_p = 2^{\,n-h^*} \tag{21} \]

and the total number of classes appearing in the definition of $T_n(X,\tau)$ is

\[ (h^*+1) \sum_p N_p \tag{22} \]

of which $\sum_p N_p$ will be peripheral classes. Since in this case interior/periphery
and parity indexing are identical, $T_n(X,\tau)$ will have the form

\[ T_n(X,\tau) =
\begin{cases}
\begin{pmatrix} 0 & A \\ A^{T} & 0 \end{pmatrix}, & \tau \text{ odd} \\[2ex]
\begin{pmatrix} B & 0 \\ 0 & C \end{pmatrix}, & \tau \text{ even}
\end{cases} \tag{23} \]

where the sizes of the matrices $A$, $B$, and $C$ are respectively
$(h^* \sum_p N_p) \times (\sum_p N_p)$, $(h^* \sum_p N_p) \times (h^* \sum_p N_p)$, and $(\sum_p N_p) \times (\sum_p N_p)$.

Case 2: $\pi(X) = 0$ and the set of peripheral states contains all of the odd
parity states and half of the even parity states. Under these assumptions,
with $d = 2^{r} m$, (20) becomes

\[ \frac{3 \cdot 2^{\,n - r(h^*-1) - 2}}{(2^{r} m - 1)\, m^{\,h^*-1}} , \tag{24} \]

and this will be an integer if and only if $m = 1$ and $r = 2$. In this case $d = 4$
and the total number of trees will be $2^{\,n-2h^*}$. The well-known rule 90 is included
under this case.

The number of states at height $h$ in each tree will be $4^{h-1} \cdot 3 = 2^{2(h-1)} \cdot 3$ and the
equivalent of (21) is

\[ \sum_p p\,N_p = 2^{\,n-2h^*} . \tag{25} \]
p 

In this case interior/periphery indexing differs from parity indexing. If all
peripheral classes contain only states of the same parity then use of parity
indexing puts $T_n(X,\tau)$ into the form of (23). In general, however, peripheral
classes will be of mixed parity and interior/peripheral indexing gives the form
of $T_n(X,\tau)$ as

\[ T_n(X,\tau) =
\begin{cases}
\begin{pmatrix} 0 & A \\ A^{T} & D \end{pmatrix}, & \tau \text{ odd} \\[2ex]
\begin{pmatrix} B & G \\ G^{T} & C \end{pmatrix}, & \tau \text{ even}
\end{cases} \tag{26} \]

where $A$, $D$, and $B$ have respective sizes $(h^* \sum_p N_p) \times (\sum_p N_p)$, $(\sum_p N_p) \times (\sum_p N_p)$,
and $(h^* \sum_p N_p) \times (h^* \sum_p N_p)$, while $G$ is the same size as $A$ and $C$
the same size as $D$. Note, however, that despite the form shown, the matrix
$T_n(X,\tau)$ is always symmetric.

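Case 3: $\pi(X) = 1$. By Lemma 11, if $n$ is odd then $T_n(X,\tau)$ in parity indexing
must have the form

\[ T_n(X,\tau) =
\begin{cases}
\begin{pmatrix} 0 & A \\ A^{T} & 0 \end{pmatrix}, & \tau \text{ odd} \\[2ex]
\begin{pmatrix} B & 0 \\ 0 & C \end{pmatrix}, & \tau \text{ even}
\end{cases} \tag{27} \]

where $A$, $B$, and $C$ are all square matrices of size $\frac{h^*+1}{2} \sum_p N_p$. Lemma
11 also implies that $N_p$ is even.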

If $n$ is even then $T_n(X,\tau)$ has the form of (23) but the size of the matrices
$A$ and $B$ is now $\bigl((h^*+1) \sum_p N_p^{(e)}\bigr) \times \bigl((h^*+1) \sum_p N_p^{(e)}\bigr)$, while $C$ is of size
$\bigl((h^*+1) \sum_p N_p^{(o)}\bigr) \times \bigl((h^*+1) \sum_p N_p^{(o)}\bigr)$. Here $N_p^{(e)}$ and $N_p^{(o)}$ are the number
of attractors of period $p$ with even or odd parity.

By Theorem 10, all states on an attractor of an additive rule will have the
same parity. Thus, each of the classes $\alpha{:}0$ for $0 \le \alpha \le N_p - 1$ will consist of
states having the same parity. This is not generally true for non-additive rules,
nor is it true in general for classes $\alpha{:}h$ for $h \ge 1$. Thus, with appropriately
chosen indexing, the matrix $T_n(X,\tau)$ for an additive rule can always be at
least put into the form of (26), although the sizes of the submatrices involved
may vary.

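What is more interesting, however, is the probability matrix $\bar{T}_n(X,\tau)$ for
which the form of (26) becomes

\[ \bar{T}_n(X,\tau) =
\begin{cases}
\begin{pmatrix} 0 & A \\ B & C \end{pmatrix}, & \tau \text{ odd} \\[2ex]
\begin{pmatrix} D & 0 \\ 0 & M \end{pmatrix}, & \tau \text{ even}
\end{cases} \tag{28} \]

Lemma 19 Let $T$ be any matrix with the form (28) with $A$ having size $n \times m$
and $C$ having size $m \times m$. Then the characteristic equation for $T$ is obtained
from

\[ \begin{aligned}
\lambda^{\,n-m}\,\bigl|\lambda^{2} I - \lambda C - BA\bigr| &= 0, & \tau \text{ odd} \\
|\lambda I - D| \cdot |\lambda I - M| &= 0, & \tau \text{ even}
\end{aligned} \tag{29} \]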
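The $\tau$ odd line of (29) follows from a Schur complement expansion. The derivation below is a sketch under the assumption that the $\tau$ odd form in (28) is the block matrix with an $n \times n$ zero block in the upper left, $A$ ($n \times m$) in the upper right, $B$ ($m \times n$) in the lower left and $C$ ($m \times m$) in the lower right; the symbols $I_n$ and $I_m$ for the identity matrices of the two sizes are notation introduced here.

```latex
\begin{aligned}
|\lambda I - T|
  &= \begin{vmatrix} \lambda I_n & -A \\ -B & \lambda I_m - C \end{vmatrix} \\
  &= |\lambda I_n| \cdot \bigl|\lambda I_m - C - B(\lambda I_n)^{-1}A\bigr|
     \qquad (\lambda \neq 0) \\
  &= \lambda^{n}\,\lambda^{-m}\,\bigl|\lambda^{2} I_m - \lambda C - BA\bigr| \\
  &= \lambda^{\,n-m}\,\bigl|\lambda^{2} I_m - \lambda C - BA\bigr| .
\end{aligned}
```

When $n \ge m$ both sides are polynomials in $\lambda$, so the identity, derived for $\lambda \neq 0$, holds for all $\lambda$. For $\tau$ even the matrix is block diagonal and its characteristic polynomial is the product of those of the two diagonal blocks.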
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In some cases the term in (29) with r odd can be factored. 

For the matrices $T_n(I,\tau)$ and $\bar{T}_n(I,\tau)$ completely general results are available.

Theorem 20 The eigenvalues of $T_n(I,\tau)$ and $\bar{T}_n(I,\tau)$ are, respectively, the
eigenvalues of the matrices $T_{n-1}(I,\tau) \pm T_{n-1}(I,\tau-1)$ and
$\frac{n-\tau}{n}\,\bar{T}_{n-1}(I,\tau) \pm \frac{\tau}{n}\,\bar{T}_{n-1}(I,\tau-1)$.

PROOF. By (4) the characteristic equation $|\lambda I - T_n(I,\tau)| = 0$ can be written as

\[ \begin{vmatrix}
\lambda I - T_{n-1}(I,\tau) & -T_{n-1}(I,\tau-1) \\
-T_{n-1}(I,\tau-1) & \lambda I - T_{n-1}(I,\tau)
\end{vmatrix} = 0 , \]

which becomes

\[ \bigl| \lambda^{2} I - 2\lambda T_{n-1}(I,\tau) + T_{n-1}^{2}(I,\tau) - T_{n-1}^{2}(I,\tau-1) \bigr| = 0 , \]

or

\[ \bigl| \lambda I - [T_{n-1}(I,\tau) + T_{n-1}(I,\tau-1)] \bigr| \cdot
   \bigl| \lambda I - [T_{n-1}(I,\tau) - T_{n-1}(I,\tau-1)] \bigr| = 0 . \]

The result for $\bar{T}_n(I,\tau)$ follows from similar calculations based on (5). □

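Corollary 21 The eigenvalues $\lambda(n+1)$ and $\bar{\lambda}(n+1)$ of $T_{n+1}(I,\tau)$ and $\bar{T}_{n+1}(I,\tau)$
are given in terms of the eigenvalues $\lambda(n)$ and $\bar{\lambda}(n)$ of $T_n(I,\tau)$ and $\bar{T}_n(I,\tau)$
by

\[ \lambda(n+1) = \lambda(n) \pm 1 , \qquad
   \bar{\lambda}(n+1) = \frac{n\,\bar{\lambda}(n) \pm 1}{n+1} . \tag{30} \]

For the matrices $T_n(X,\tau)$ and $\bar{T}_n(X,\tau)$ the case is more complex, and only a
few results are available.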

Theorem 22 Let $D$ represent the global operator for elementary rule 60 and
let $n = 2^{k}$. Then $\bar{T}_n(D,1)$ has the form

\[ \bar{T}_n(D,1) = \begin{pmatrix} 0 & A \\ \mathbf{1} & 0 \end{pmatrix} \]

where $A$ is a $2^{k} \times 1$ column with entries

\[ A_h =
\begin{cases}
\dfrac{1}{2^{\,2^{k}-1}}, & h = 0 \\[2ex]
\dfrac{2^{\,h-1}}{2^{\,2^{k}-1}}, & 1 \le h \le 2^{k} - 1
\end{cases} \tag{31} \]

and the $\mathbf{1}$ in the lower left block represents a row consisting of $2^{k}$ ones. The
characteristic equation of $\bar{T}_n(D,1)$ is

\[ \lambda^{\,2^{k}-1}(\lambda^{2} - 1) = 0 . \tag{32} \]



PROOF. The STD for rule 60 with $n = 2^{k}$ consists of a single tree rooted at
$\vec{0}$ with maximum height $h^* = 2^{k}$. Further, by Lemma 9, all of the odd parity
states reside at the top of this tree. Thus, there is a single odd class, labelled
$0{:}2^{k}$, and there are $2^{k}$ even parity classes labelled $0{:}h$ for $0 \le h \le 2^{k} - 1$.
Further, for $h \ge 1$ each of the classes $0{:}h$ contains $2^{h-1}$ states while the
class $0{:}0$ contains only a single member. Each member of the interior classes
is even and hence with $\tau = 1$ mutates to an odd state in the class $0{:}2^{k}$ a total
of $n = 2^{k}$ times (one mutation for each digit in the state). Thus the $(0{:}2^{k}, 0{:}h)$
entry of the matrix $T_{2^{k}}(D,1)$ is $2^{k+h-1}$ for $h \ge 1$ (and $2^{k}$ for $h = 0$), while the
$(0{:}h', 0{:}h)$ entries will be 0 for $0 \le h, h' < 2^{k}$. The $(0{:}2^{k}, 0{:}2^{k})$ entry is 0, and
since the matrix must be symmetric, the $(0{:}h, 0{:}2^{k})$ entries are also $2^{k+h-1}$.
Hence the column sum of the final column of the matrix is

\[ 2^{k} + \sum_{h=1}^{2^{k}-1} 2^{\,k+h-1}
   = 2^{k} \Bigl[ 1 + \sum_{h=1}^{2^{k}-1} 2^{\,h-1} \Bigr]
   = 2^{\,k + 2^{k} - 1} . \]

Dividing each element of the final column by this sum yields the form given
in (31) while division of each element of the final row of the matrix by the
corresponding column sum yields 1. The characteristic equation then follows
immediately as an application of Lemma 19. □
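Theorem 22 can be checked by brute force for small $k$. The sketch below is an illustration, not the paper's method: it builds the classes $\alpha{:}h$ directly from the state transition diagram, counts $\tau$-point mutations between classes, normalises columns, and computes the characteristic polynomial with the Faddeev-LeVerrier recursion in exact rational arithmetic. All function names are hypothetical. It uses the standard (centred) Wolfram numbering, which for rule 60 is the mirror image of the left justified form $D = I + \sigma$; the assumption is that this reflection only relabels states and leaves the matrices unchanged.

```python
from fractions import Fraction
from itertools import combinations

def elementary(rule_number):
    """Global map of an elementary CA on a cylinder; bit j of a state is cell j."""
    def step(mu, n):
        out = 0
        for j in range(n):
            left = (mu >> ((j - 1) % n)) & 1
            centre = (mu >> j) & 1
            right = (mu >> ((j + 1) % n)) & 1
            out |= ((rule_number >> (4 * left + 2 * centre + right)) & 1) << j
        return out
    return step

def class_labels(rule, n):
    """Label every state by (attractor id, height above that attractor)."""
    on_cycle = set(range(1 << n))
    while True:                        # shrink to the set of recurrent states
        image = {rule(x, n) for x in on_cycle}
        if image == on_cycle:
            break
        on_cycle = image
    label = {}
    for x in range(1 << n):
        y, h = x, 0
        while y not in on_cycle:       # climb down the tree to the attractor
            y, h = rule(y, n), h + 1
        rep, z = y, rule(y, n)
        while z != y:                  # attractor id: smallest state on the cycle
            rep, z = min(rep, z), rule(z, n)
        label[x] = (rep, h)
    return label

def mutation_matrices(rule, n, tau):
    """Count matrix T, column-normalised matrix Tbar, and class sizes."""
    label = class_labels(rule, n)
    names = sorted(set(label.values()))
    idx = {c: i for i, c in enumerate(names)}
    m = len(names)
    T = [[0] * m for _ in range(m)]
    masks = [sum(1 << b for b in sites) for sites in combinations(range(n), tau)]
    for x in range(1 << n):
        for mask in masks:             # x -> x XOR mask is one tau-point mutation
            T[idx[label[x ^ mask]]][idx[label[x]]] += 1
    col = [sum(T[i][j] for i in range(m)) for j in range(m)]
    Tbar = [[Fraction(T[i][j], col[j]) for j in range(m)] for i in range(m)]
    sizes = [c // len(masks) for c in col]
    return names, T, Tbar, sizes

def charpoly(M):
    """Coefficients of det(lambda*I - M), highest power first (Faddeev-LeVerrier)."""
    n = len(M)
    def mul(P, Q):
        return [[sum(P[i][k] * Q[k][j] for k in range(n)) for j in range(n)]
                for i in range(n)]
    coeffs = [Fraction(1)]
    N = [[Fraction(0)] * n for _ in range(n)]
    for k in range(1, n + 1):
        N = mul(M, N)
        for i in range(n):
            N[i][i] += coeffs[-1]
        coeffs.append(-sum(mul(M, N)[i][i] for i in range(n)) / k)
    return coeffs
```

For $n = 4$, `charpoly(mutation_matrices(elementary(60), 4, 1)[2])` gives the coefficients of $\lambda^{5} - \lambda^{3} = \lambda^{3}(\lambda^{2} - 1)$, in agreement with (32).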



Another result following from Lemma 19 generalizes this theorem: 

Theorem 23 Let A be an m x m probability matrix and let {pi\l < i < 
r , < Pi < 1} be a set of non-negative numbers with Yn=iPi ^ 1- Define an 
(r + l)m x (r + l)m probability matrix T by 



T 



(q 





yA A 



\ 



PiA 

Pr-A 

1 t v, ) 



(33) 
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Then the characteristic equation ofT is obtained from 



A (r-l)m | A/ _ M 



(34) 



An example of Theorem 23 is given by rule 90 on a cylinder of size 6. Taking
$\delta$ as the global operator for rule 90, the matrix $\bar{T}_6(\delta, 1)$ is



\A 



(35) 



with 



/oo 





A 



i i 

6 6 



1 1 

2 6 



1 1 

3 6 



1 1 

6 3 





1 1 

3 3 

1 1 



1 1 

6 6 



l - i 

6 u 3 



(36) 



l°HH°/ 



The characteristic polynomial of $\bar{T}_6(\delta, 1)$ is obtained from
$|\lambda I - A| \cdot |\lambda I + A/3| = 0$.

Appendix A lists the characteristic polynomials of $\bar{T}_n(X, 1)$ for a number of
additive and non-additive rules for varying cylinder sizes.

The matrix $T_n(X,\tau)$ has an immediate interpretation as the weighted
adjacency matrix of the graph $H_n(X,\tau)$ with vertices labelled by the classes
$\alpha{:}h(\alpha)$ and the edges between any pair of vertices weighted by the number
of $\tau$-point mutations that take elements from the class labelling one vertex
to that labelling the other and vice versa. Thus the $(\alpha{:}h(\alpha), \beta{:}h(\beta))$ element
of $\bar{T}_n(X,\tau)$ is the probability that a $\tau$-point mutation of the class $\beta{:}h(\beta)$ will
be in the class $\alpha{:}h(\alpha)$. While $T_n(X,\tau)$ is necessarily symmetric, this is not in
general true for $\bar{T}_n(X,\tau)$.

An intuitive understanding of the spectra of both $T_n(X,\tau)$ and $\bar{T}_n(X,\tau)$ is
obtained by relating the eigenvalues of these matrices to the solvability
conditions for a system of linear equations connected with the graph $H_n(X,\tau)$.
Label each vertex of this graph by a variable $x_i$. Then the question of finding
values $y_i$, not all zero, of these variables such that each $y_i$ is proportional,
with the same constant of proportionality, to the sum of all values $y_j$ such
that there is an edge connecting vertex $j$ to vertex $i$ is equivalent to solving
the homogeneous system of equations

\[ \lambda x_i = \sum_{j \to i} [T_n(X,\tau)]_{ij}\, x_j
   \quad\text{or equivalently}\quad \lambda x = T_n(X,\tau)\, x , \tag{37} \]

hence the possible proportionality factors $\lambda$ are just the eigenvalues of $T_n(X,\tau)$.

If, instead, each value $y_i$ is required to be proportional to the mean value of all
$y_j$ such that there is an edge connecting vertex $j$ to vertex $i$, the corresponding
set of linear equations to be satisfied becomes

\[ \lambda x_i = \frac{1}{d_i} \sum_{j \to i} [T_n(X,\tau)]_{ij}\, x_j
   \quad\text{or}\quad \lambda x = D^{-1} T_n(X,\tau)\, x , \tag{38} \]

where $d_i$ is the in-degree of vertex $i$, namely

\[ d_i = \sum_j [T_n(X,\tau)]_{ij} \quad\text{and}\quad D = \mathrm{diag}(d_i) . \tag{39} \]

With these definitions $\bar{T}_n(X,\tau) = T_n(X,\tau) D^{-1}$. Since $T_n(X,\tau)$ is symmetric,
$D^{-1} T_n(X,\tau)$ is the transpose of $\bar{T}_n(X,\tau)$ and has the same eigenvalues, so that the
eigenvalues of $\bar{T}_n(X,\tau)$ are the possible proportionality factors for which the
numerical values of each $x_i$ are proportional to the mean of all $x_j$ values at
vertices $j$ having an edge connecting them to vertex $i$. If the characteristic
polynomials of $T_n(X,\tau)$ and $\bar{T}_n(X,\tau)$ are written respectively as
$P(\lambda) = \lambda^{n} + a_1 \lambda^{n-1} + \cdots + a_n$ and $Q(\lambda) = \lambda^{n} + q_1 \lambda^{n-1} + \cdots + q_n$, then the
coefficients of these polynomials are [5]:

\[ a_i = \sum_{L \in \Lambda_i} (-1)^{p(L)} , \qquad
   q_i = \sum_{U \in \Upsilon_i} (-1)^{p(U)}\, \frac{2^{\,c(U)}}{\prod_{h \in V(U)} d_h} , \qquad
   i = 1, \ldots, n , \tag{40} \]

where $\Lambda_i$ is the set of linear directed subgraphs on $i$ vertices, $p(L)$ is the number
of cycles in the linear directed subgraph $L$, $\Upsilon_i$ is the set of basic figures of size
$i$, $p(U)$ is the number of components in $U$, $c(U)$ is the number of circuits in
$U$, and $V(U)$ is the vertex set of $U$.

Since $\bar{T}_n(X,\tau)$ is a probability matrix it has maximum eigenvalue 1,
corresponding to the case in which the numerical value of each variable $x_i$ in (38)
is equal to the mean of all values $x_j$ such that there is an edge connecting
vertex $j$ to vertex $i$.

Theorem 24 Let the matrix $\bar{T}_n(X,\tau)$ be defined in state, shift, or ACS
representation. Let $\{p(\tau) \in [0,1] \mid 0 \le \tau \le k\}$ be a set of non-negative numbers
such that $\sum_{\tau=0}^{k} p(\tau) = 1$ and define a probability matrix $\bar{T}_n(X,p(\tau))$ by

\[ \bar{T}_n(X,p(\tau)) = \sum_{\tau=0}^{k} p(\tau)\, \bar{T}_n(X,\tau) . \tag{41} \]

Then the steady state vector of the Markov process with transition matrix
$\bar{T}_n(X,p(\tau))$ is the uniform distribution in which the probability of each class
equals the number of states in that class divided by $2^{n}$, the total number of
states in $E_n$.

Remark 25 van Nimwegen, et al. [29] prove a similar result for the case
$p(1) = 1$, $\tau = 1$.



PROOF. Let $m$ be the number of equivalence classes used in the definition
of the matrices $\bar{T}_n(X,\tau)$ and let $v$ be the $m$-dimensional vector corresponding
to a uniform distribution over these classes. Then

\[ [\bar{T}_n(X,\tau) \cdot v]_i = \sum_j [\bar{T}_n(X,\tau)]_{ij}\, v_j . \tag{42} \]

Since $[\bar{T}_n(X,\tau)]_{ij}$ is the probability that a $\tau$-point mutation from the $j$-th class
will be in the $i$-th class, while $v_j$ is the fraction of the total number of states
that are contained in the $j$-th class, the sum in (42) is the probability that a
state chosen at random from $E_n$ lies in the $i$-th class after a $\tau$-point mutation.
The number of possible mutations of states in $E_n$ is $2^{n}\binom{n}{\tau}$ while the number
of possible mutations from the $i$-th class to $E_n$ is $n_i\binom{n}{\tau}$ where $n_i$ is the number
of states in this class. But all mutations are reversible, hence this last number
is also the number of $\tau$-point mutations from $E_n$ to the $i$-th class. Hence the
probability that a randomly chosen state of $E_n$ will mutate to a state in the
$i$-th class is

\[ \frac{n_i \binom{n}{\tau}}{2^{n} \binom{n}{\tau}} = \frac{n_i}{2^{n}} = v_i . \tag{43} \]

Thus (42) becomes $\bar{T}_n(X,\tau)\,v = v$. □
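The statement of Theorem 24 is easy to test numerically in the state representation, for additive and non-additive rules alike. The sketch below is illustrative only: the function names are hypothetical, the rules use the standard (centred) Wolfram numbering, and the weights $p(\tau)$ passed in are arbitrary choices made for the example.

```python
from fractions import Fraction
from itertools import combinations

def elementary(rule_number):
    """Global map of an elementary CA on a cylinder; bit j of a state is cell j."""
    def step(mu, n):
        out = 0
        for j in range(n):
            left = (mu >> ((j - 1) % n)) & 1
            centre = (mu >> j) & 1
            right = (mu >> ((j + 1) % n)) & 1
            out |= ((rule_number >> (4 * left + 2 * centre + right)) & 1) << j
        return out
    return step

def class_labels(rule, n):
    """Label every state by (attractor id, height above that attractor)."""
    on_cycle = set(range(1 << n))
    while True:
        image = {rule(x, n) for x in on_cycle}
        if image == on_cycle:
            break
        on_cycle = image
    label = {}
    for x in range(1 << n):
        y, h = x, 0
        while y not in on_cycle:
            y, h = rule(y, n), h + 1
        rep, z = y, rule(y, n)
        while z != y:
            rep, z = min(rep, z), rule(z, n)
        label[x] = (rep, h)
    return label

def mutation_matrices(rule, n, tau):
    """Count matrix T, column-normalised matrix Tbar, and class sizes."""
    label = class_labels(rule, n)
    names = sorted(set(label.values()))
    idx = {c: i for i, c in enumerate(names)}
    m = len(names)
    T = [[0] * m for _ in range(m)]
    masks = [sum(1 << b for b in sites) for sites in combinations(range(n), tau)]
    for x in range(1 << n):
        for mask in masks:
            T[idx[label[x ^ mask]]][idx[label[x]]] += 1
    col = [sum(T[i][j] for i in range(m)) for j in range(m)]
    Tbar = [[Fraction(T[i][j], col[j]) for j in range(m)] for i in range(m)]
    sizes = [c // len(masks) for c in col]
    return names, T, Tbar, sizes

def uniform_is_steady(rule, n, weights):
    """weights maps tau -> p(tau); True if sum_tau p(tau) Tbar(tau) fixes v."""
    mixed, sizes, m = None, None, 0
    for tau, p in weights.items():
        names, _, Tbar, sizes = mutation_matrices(rule, n, tau)
        m = len(names)
        if mixed is None:
            mixed = [[Fraction(0)] * m for _ in range(m)]
        for i in range(m):
            for j in range(m):
                mixed[i][j] += p * Tbar[i][j]
    v = [Fraction(s, 1 << n) for s in sizes]   # class size divided by 2^n
    return all(sum(mixed[i][j] * v[j] for j in range(m)) == v[i]
               for i in range(m))
```

Because the arithmetic is exact, the check is an equality test rather than a tolerance test; for example `uniform_is_steady(elementary(30), 5, {1: Fraction(1, 2), 2: Fraction(1, 2)})` mixes one-point and two-point mutations with equal weight.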



5 Relations Between State-Basin and Shift-Basin Representations 



Although the shift-basin matrix $\bar{T}_n(X^*,\tau)$ was introduced in Section 3,
attention so far has focused on the state-basin transition matrix $\bar{T}_n(X,\tau)$. In this
section the relation between these two matrices is explored.

The equivalence classes $\alpha{:}h(\alpha)$ used to define $\bar{T}_n(X,\tau)$ are sets of states at
the same height $h(\alpha)$ above the attractor $\alpha$ in the state transition diagram
of the rule $X$. The classes used in the definition of $\bar{T}_n(X^*,\tau)$ are sets of shift
cycles with each shift cycle in the set being at the same height $h(\alpha)$ above the
corresponding attractor $\alpha$ in the state-basin diagram. Since cellular automata
rules commute with the shift, the height of a state in the STD is the same
height as the shift cycle to which that state belongs in the SBD. Thus a start
at understanding the relation between the matrices $\bar{T}_n(X,\tau)$ and $\bar{T}_n(X^*,\tau)$
can be found in the relation between the STD and the SBD for the rule $X$.
This relation is most easily explored for additive rules. In that case, Lemma
8, Theorem 10, and Lemma 12 are available as characterizations of the STD,
while Theorem 17 gives a specific connection between this diagram and the
shift-basin diagram.

If each of the classes $\alpha{:}h(\alpha)$ in the STD consists of a single shift cycle, then
the corresponding transition matrices $\bar{T}_n(X,\tau)$ and $\bar{T}_n(X^*,\tau)$ are equal. This
will be true, for example, for $X = \sigma^{k}$ for any value of $k$. In general, however,
either at least some of the classes $\alpha{:}h(\alpha)$ will be composed of a union of more
than one shift cycle, or will consist of a union of subsets of several shift cycles.
The binary difference rule on a cylinder of size 5 is an example of the first
case: in addition to the fixed point there is a single attractor which is a
period 15 cycle composed of the union of the three shift cycles $\{\sigma^{r}(00011)\}$,
$\{\sigma^{r}(00101)\}$, and $\{\sigma^{r}(01111)\}$ for $0 \le r \le 4$. The same rule on a cylinder of
size 6 gives an example of the second case: in addition to the fixed point and
a period 3 cycle that is a shift cycle, there are two period 6 cycles, the first
consisting of the union of the sets $\{\sigma^{2r}(000101)\}$ and $\{\sigma^{2r}(001111)\}$, and the
second of the union of the sets $\{\sigma^{2r}(001010)\}$ and $\{\sigma^{2r}(011110)\}$ for $0 \le r \le 2$.

If the classes $\alpha{:}h(\alpha)$ are composed of a union of several full shift cycles then
again $\bar{T}_n(X,\tau) = \bar{T}_n(X^*,\tau)$ since the shift classes used in defining $\bar{T}_n(X^*,\tau)$
contain exactly the same states as the corresponding classes $\alpha{:}h(\alpha)$. Thus,
only the case in which the classes $\alpha{:}h(\alpha)$ are composed of the union of proper
subsets of shift cycles needs to be considered. When this is the case, each shift
cycle contributes equally if the rule is additive.

Lemma 26 Let $X$ be an additive rule defined on a cylinder of size $n$ such that
the equivalence classes $\alpha{:}h(\alpha)$ are composed of subsets of two or more shift
cycles. Then, for fixed height $h$, each class $\alpha{:}h(\alpha)$ contains an equal number
of elements from each of these shift cycles.

For this final case, elementary row and column operations can be used to
reduce the matrix $\bar{T}_n(X,\tau)$ to the form

\[ \bar{T}_n(X,\tau) \longrightarrow
   \begin{pmatrix} \bar{T}_n(X^*,\tau) & 0 \\ B & C \end{pmatrix} \tag{44} \]

where $C$ is a square matrix. The characteristic equation for this matrix now
becomes

\[ |\lambda I - C| \cdot |\lambda I - \bar{T}_n(X^*,\tau)| = 0 , \tag{45} \]

so that the eigenvalues of $\bar{T}_n(X,\tau)$ consist of the eigenvalues of $\bar{T}_n(X^*,\tau)$ and
those of the matrix $C$. In many cases the latter eigenvalues will either be all
0, or will be the same as some of the eigenvalues of $\bar{T}_n(X^*,\tau)$.

Algorithm 1 Let $c(\Sigma_j)$ be the set of classes $\alpha_i{:}h(\alpha_i)$, for fixed height $h$, that
are composed of states drawn from the shift cycles $S_i$ in the set $\Sigma_j$.

(1) Put $\bar{T}_n(X,\tau)$ into the form of (28).

(2) The classes $\alpha_i{:}h(\alpha_i)$ label rows and columns of $\bar{T}_n(X,\tau)$. Let $\alpha_0{:}h(\alpha_0)$ be
the label of the first row of $\bar{T}_n(X,\tau)$ corresponding to an element of $c(\Sigma_j)$
and add to this row the remaining rows labelled by members of $c(\Sigma_j)$.
Then move these remaining rows to the bottom of the matrix.

(3) Subtract the column labelled by $\alpha_0{:}h(\alpha_0)$ from each of the columns labelled
by the remaining elements of $c(\Sigma_j)$; then move these columns to the far
right of the matrix.

(4) Carry out steps (2) and (3) for each value of the height $h$ and for each
set of shift cycles $\Sigma_j$.

Examples of this algorithm are given in Appendix B.

Theorem 27 Let $X$ be a cellular automata rule defined on a cylinder of size $n$
such that the STD of $X$ contains attractor basins in which there are equivalence
classes $\alpha{:}h(\alpha)$ composed of the union of proper subsets of two or more shift
cycles. Then Algorithm 1 will put the matrix $\bar{T}_n(X,\tau)$ into the form of (44).

PROOF. All that is required is to show that the operations in step (3) of
this algorithm will in fact produce the block of zeros in the matrix of (44).
The remainder of the blocks in this matrix require no explanation: $\bar{T}_n(X^*,\tau)$
arises from the construction set out in the algorithm and the forms of the
matrices $B$ and $C$ are not directly specified.

The block of zeros arises when a column labelled $\alpha{:}h(\alpha)$ is subtracted from
another column labelled $\beta{:}h(\beta)$ in the case where both equivalence classes so
labelled share subsets from the same set of shift cycles. But this means that
if $\mu \in \alpha{:}h(\alpha)$ and $\gamma \in \beta{:}h(\beta)$ are drawn from different subsets of the same
shift cycle then there is some fixed $k$ such that $\gamma = \sigma^{k}(\mu)$. Thus, both of these
states toggle in identical ways, up to a shift, and so will be identical in row
positions corresponding to equivalence classes consisting of the union of sets
of full shift cycles. But by construction, these are precisely the classes that
label the rows corresponding to the matrix $\bar{T}_n(X^*,\tau)$. This means that the
only possible non-zero contributions will appear in the matrix $C$. □

As a result of this theorem we have another immediate result arising from the
case in which the matrix $C$ is 0:

Theorem 28 Let $X$ be a cellular automata rule on a cylinder of size $n$ such
that the matrix $\bar{T}_n(X,\tau)$ has the form of (44) with $C = 0$, and let $\tau$ be odd.
Then if $\phi^*(\lambda) = 0$ is the characteristic equation of $\bar{T}_n(X^*,\tau)$, the corresponding
characteristic equation of $\bar{T}_n(X,\tau)$ will be $\phi(\lambda) = \lambda^{k} \phi^*(\lambda) = 0$ for some
$k$.

In general, if $\phi^*(\lambda) = 0$ and $\phi(\lambda) = 0$ are the characteristic equations of
$\bar{T}_n(X^*,\tau)$ and $\bar{T}_n(X,\tau)$ respectively, then $\phi(\lambda) = f(\lambda)\phi^*(\lambda)$. The conditions
under which $f(\lambda)$ has roots that coincide with roots of $\phi^*(\lambda)$ are uncertain.
This is true for the two additive rules described in Appendix B, but not for the
third, non-additive rule described there. No cases of additive rules for which
$\bar{T}_n(X,\tau)$ and $\bar{T}_n(X^*,\tau)$ have different eigenvalues have been found.

Conjecture 29 Let $X$ be an additive rule. Then $\bar{T}_n(X,\tau)$ and $\bar{T}_n(X^*,\tau)$ have
the same eigenvalues although the multiplicities of some eigenvalues will differ.



6 Discussion 

A number of results have been presented on the use of transition matrices to
describe mutational transitions between cellular automata attractor basins,
but many questions remain open. Appendix A, for example, lists the
characteristic equation for the matrix $\bar{T}_n(X,1)$ for a number of additive and
non-additive rules. Inspection shows that the eigenvalues of this matrix for the
additive rules considered are simple fractions with denominators related to
cycle periods. This is not the case for the non-additive rules. Resolving
whether or not this is generally true is of great interest. It might
be conjectured, for example, that the eigenvalues of $\bar{T}_n(X,\tau)$ for additive rules
are always fractions in which the denominator is equal to a cycle period, or to
an integer factor of a cycle period.

Another point of interest arises concerning the conjecture at the end of Section
5. Proof of this conjecture, or determination of the necessary and sufficient
conditions for the matrices $\bar{T}_n(X,\tau)$ and $\bar{T}_n(X^*,\tau)$ to have the same
eigenvalues, would be valuable in understanding the connection between rule state
transition diagrams, their shift equivalents, and the toggle relation which is
defined on the $n$-hypercube. Further analysis of the relation between the
structure of a rule STD and the matrix $\bar{T}_n(X,\tau)$ could also shed more light on these
connections.

In the case of additive rules, such analysis might prove relatively simple since
a $\tau$-point mutation is equivalent to the transformation $\mu \to \mu + \eta$ where $\eta$ is
a state containing $\tau$ ones located at the toggle sites. For example, the next
theorem, on the preservation of vertex structure in state transition diagrams,
follows immediately:

Theorem 30 Let $X$ be an additive cellular automata rule defined on a
cylinder of size $n$. Let $\mu$ and $\mu'$ be predecessor states of a state $\gamma$, and let $\xi$ and $\xi'$
be $\tau$-point mutations of $\mu$ and $\mu'$: $\xi = \mu + \eta$, $\xi' = \mu' + \eta$. Then: (1) Both $\xi$
and $\xi'$ are predecessors of the state $\gamma + X(\eta)$; (2) Both $\mu$ and $\vec{1} + \mu$ mutate to
predecessors of the same state $\gamma + X(\eta)$.
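Part (1) of Theorem 30 is additivity in disguise: $X(\mu + \eta) = X(\mu) + X(\eta)$, so every predecessor of $\gamma$ is carried to a predecessor of $\gamma + X(\eta)$. A minimal sketch of a brute-force check follows; the function names are hypothetical, rule 90 is taken in its standard centred form as the additive example, and rule 30 is included only as a non-additive contrast.

```python
from itertools import combinations

def neighbours(mu, n):
    """Return (left, right): bit j holds the value of cell j-1 and of cell j+1."""
    mask = (1 << n) - 1
    left = ((mu << 1) | (mu >> (n - 1))) & mask
    right = ((mu >> 1) | (mu << (n - 1))) & mask
    return left, right

def rule90(mu, n):
    """Additive rule 90: each cell becomes the XOR of its two neighbours."""
    left, right = neighbours(mu, n)
    return left ^ right

def rule30(mu, n):
    """Non-additive rule 30: left XOR (centre OR right)."""
    left, right = neighbours(mu, n)
    return left ^ (mu | right)

def mutations_preserve_vertices(rule, n, tau):
    """True if, for every eta with tau ones and every state mu,
    X(mu + eta) = X(mu) + X(eta), i.e. predecessors of gamma are mapped
    to predecessors of gamma + X(eta)."""
    for sites in combinations(range(n), tau):
        eta = sum(1 << b for b in sites)
        image_of_eta = rule(eta, n)
        for mu in range(1 << n):
            if rule(mu ^ eta, n) != rule(mu, n) ^ image_of_eta:
                return False
    return True
```

For rule 90 the check succeeds for every $\tau$, while for rule 30 it already fails at $\tau = 1$, which illustrates why the theorem is restricted to additive rules.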

Analysis of special cases can be useful as well. For example, the binary
difference rule has operator form $D = I + \sigma$. Thus, $D(\mu) = \mu + \sigma(\mu) = D(\vec{1} + \mu)$.
Making use of these relations, it is easy to show that if $\mu$ is at height $h(\mu)$ in
the STD of this rule while $D^{\,h(\mu)-1}(\mu) = \gamma$, then taking $t_\tau$ as the operation of
making a $\tau$-point mutation,

\[ t_\tau(\gamma) = t_\tau(\mu) + \sigma \sum_{r=0}^{h(\mu)-2} D^{\,r}(\mu) . \tag{46} \]

On a more general note, a number of extensions of the work reported in
this paper may be possible. Here the transitions between attractor basins
were accomplished by point mutations. Another possibility is to specify a
probability matrix a priori. This could be used, for example, to model potential
wells by allowing transition probabilities to depend on the height
of a state above the attractor. Another line of work would be to consider
non-cylindrical cellular automata strings of length mn with a random "heat bath"
at the right end and develop models of fluctuation enhancement processes.
Work along both of these directions is currently in progress.
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A Examples of Characteristic Equations of T n (X, 1) 



Below, characteristic equations are given only for cases in which the size of
$\bar{T}_n(X, 1)$ is sufficiently small. Also, the third column, $t_M$, gives the maximum
period.
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A.l T n (D, 1) for 2 < n < 13 (rule 60j 



n 


Characteristic Equation 


tM 


1 


A 2 -l 


1 


2 


A(A 2 -1) 


1 


3 


(A 2 " 1) (A 2 - §) 


3 


4 


A 3 (A 2 - 1) 


1 


5 


(A 2 " 1) (A 2 - ^) 


15 


6 


a 8 (a 2 -i)(a 2 -|) 


6 


7 


(A 2 -l)(A 2 -i) 2 (A 2 -^) 7 


7 


8 


A 7 (A 2 - 1) 


1 


9 


( A l ) ( A 9) ( A 3,969) ( A 3,969) ( A 3,959) 


63 


10 


A10 ( A2 "1) ( A2 -^) 7 ( A2 -^) 2 


30 


11 


( A l ) ( A 961 ) ( A 116,281 ) 


341 


12 


A 84 (A 2 - 1) [A 2 - |) (A 2 -^) 4 (A 2 -^) 12 


12 


13 


( A l ) ( A 74,529) ( A 3,959) 


819 



A. 2 T n (5, 1) for 4 < n even < 8 (rule 90: n odd cases same as rule 60) 



n 


Characteristic Equation 


tM 


4 


A(A-l) (A + i) 


1 


6 


( a -i)( a2 -|) 2 ( a2 -^t) 2 ( a + I) 2 ( a -|) 


3 


8 


A 3 (A-1) (A + I) 


1 
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A. 3 T n (A, 1) for 3 < n < 11 (n = 8, 10 excluded) (rule 150) 



n 


Characteristic Equation 




3 


(A 2 " 1) (A 2 - |) 


1 


4 


A 4 (A 2 - 


-1) 


4 


5 


(A 2 - 1) (A 


2 L.) 

225 J 


15 


6 


A 5 (A 2 -1) 




2 


7 


(A 2 " 1) (A 2 - 


(A 2,401 ) 


7 


9 


(A 2 - 1) (A 2 - |) (A 2 - 


1 U\2 1 A 
3,969 7 V /l 35,721 / 


63 


11 


(A 2 - 1) (A 2 - 


2 f\2 441 \ 

v 116,281 y 


341 



A. 4 T„(18,l) for 3 < n < 7 (rule 18) 





n 


Characteristic Equation 


tu 




3 


(A-l) (A 2 + |A-|) 


1 




4 


(A-l) (A 3 + § A 2 - ±A - i) 


4 




5 


(A !) (A + n A 1100 A 1100 ) 


10 




6 


c\ n f\ 2 ^ f\ 5 i 2A4 131A3 5a 2 , a , i \ 

/ V 8/ V 9 324 162 1 72 1 5,832/ 


3 




7 


C \ 1) (\ 5 457A4 1.013A 3 , 769A 2 | 1,712A 128 \ 
y > \ 1,827 5,481 1 38,367 1 268,569 626,661 J 


1 




r n (22, 1) /or 3 < n < 6 (rule 22) 




n 


Characteristic Equation 


tu 


3 


(A-l)(A 2 + iA-|) 


1 


4 


A(A-l)(A3 + fA 2 -iA-!0 


4 


5 


(A-1)(A 2 -^) (A 4 + ¥-*-W + T§5) 


1 


6 


(\ '\ ) (\ 6 \ a5 117a4 i 59A3 i 1 ' 009A2 7A 1 ^ 

V ^ V 10 324 1 6,480 1 38,880 2,592 19,440/ 


2 
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A. 6 T n (30, 1) for 3 < n < 7 (rule 30) 



n 


Characteristic Equation 




3 


(A 2 - 1) (A 2 - §) 


1 


4 


A 3 (A 2 -1) 


8 


5 


(A 2 - 1) (A 2 - ^) (A 2 -^) 2 


5 


6 


A 2 (A-1) (A 2 -I). 

( \7 , \ 6 7A 5 117A 4 7A 3 , 7A 2 , A 1 \ 
\ 48 432 1,296 1 432 1 3,888 3,888 J 


2 


7 


\2(\ -\\ ( \2 ^(X 5 1 31A4 103A3 23X2 I 3A 1 229 ^| 
v / V 9/ V 63 882 686 1 2,744 1 1,210,104/ 


63 



A. 7 T„(54, 1) for 3 < n < 6 (We 54 j 



n 


Characteristic Equation 


t&4 


3 


(A 2 " 1) (A 2 - |) 


1 


4 


A 6 (A 2 -1) 


4 


5 


A(A 2 -1)(A 2 -^) (A 2 -I) 


1 


6 


\2/\ i\ /\2 A 1 W\5 , 3A 4 7A 3 341A 2 53A , 5 \ 
v ' \ 4 16; V 4 240 6,480 38,880 1 10,976 J 


12 



A. 5 T n (110,l) /or 3 < n < 6 (We 110) 



n 


Characteristic Equation 


tM 


3 


(A 2 " 1) (A 2 - |) 


1 


4 


A 6 (A 2 - 1) (A 2 - i) 


4 


5 


A 2 (A-1) (A 3 + ¥-i-2l) 


1 


6 


( \ 1) ( \ 7 \ 7x6 \ 67x5 7x4 19A3 i 61x2 i 7A 1 ^1 

l A / y 9 1,620 120 4,320 1 69,984 1 349,920 349,920/ 


18 
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B Examples of Algorithm 1 for r = 1 



Example 31 (Binary difference rule on a cylinder of size 6) Let $D$ be
the global operator for the binary difference rule. On a cylinder of size 6 the
equivalence classes of states are:

0:0 = {000000}
0:1 = {111111}
0:2 = {010101, 101010}
1:0 = {$\sigma^{2r}(000101)$; $\sigma^{2r}(001111)$ | $0 \le r \le 2$}
1:1 = {$\sigma^{2r}(000011)$; $\sigma^{2r}(111010)$ | $0 \le r \le 2$}
1:2 = {$\sigma^{2r}(000001)$; $\sigma^{2r}(111110)$ | $0 \le r \le 2$}
2:0 = {$\sigma^{2r}(001010)$; $\sigma^{2r}(011110)$ | $0 \le r \le 2$}
2:1 = {$\sigma^{2r}(000110)$; $\sigma^{2r}(110101)$ | $0 \le r \le 2$}
2:2 = {$\sigma^{2r}(000010)$; $\sigma^{2r}(111101)$ | $0 \le r \le 2$}
3:0 = {$\sigma^{r}(011011)$ | $0 \le r \le 2$}
3:1 = {$\sigma^{r}(001001)$ | $0 \le r \le 2$}
3:2 = {$\sigma^{r}(111000)$ | $0 \le r \le 2$}
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The equivalence classes sharing states from the same shift cycles are (1:0,2:0), 
(1:1,2:1), and (1:2,2:2). In the form of (28) the matrix T n (D,l) is 
















[Matrix omitted: the entries of the 12 × 12 matrix $T_n(D,1)$ are not recoverable from the source text.]














The indices for both rows and columns are ordered 0:0, 0:1, 1:0, 1:1, 2:0, 2:1,
3:0, 3:1, 0:2, 1:2, 2:2, 3:2. Add the row labelled 2:0 to that labelled 1:0, the row
labelled 2:1 to that labelled 1:1, and the row labelled 2:2 to that labelled 1:2.
Then move rows 2:0, 2:1, and 2:2 to the bottom of the matrix. This yields



[Matrix omitted: the 12 × 12 matrix obtained after the row operations is not recoverable from the source text.]



Now subtract the column labelled 1:0 from column 2:0, the column labelled 1:1
from column 2:1, and the column labelled 1:2 from column 2:2. Following this,
shift columns 2:0, 2:1, and 2:2 to the far right of the matrix. The result is



[Matrix omitted: the resulting 12 × 12 matrix is not recoverable from the source text.]

which is in the form of (44), with the upper $9 \times 9$ block being the matrix $T_n(X^*, 1)$.
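
The paired row and column operations used above preserve the characteristic polynomial: adding row $j$ to row $i$ is left multiplication by $E = I + e_i e_j^T$, and subtracting column $i$ from column $j$ is right multiplication by $E^{-1} = I - e_i e_j^T$, so the combined step is the similarity $E A E^{-1}$. The sketch below illustrates this on a small matrix; the $3 \times 3$ matrix is a made-up example, not one of the matrices in this appendix, and the helper names are hypothetical. The row and column permutations of Algorithm 1 are not modelled.

```python
from fractions import Fraction as F


def add_row_then_subtract_col(A, i, j):
    """Add row j to row i, then subtract column i from column j (E A E^-1)."""
    B = [row[:] for row in A]
    n = len(B)
    for k in range(n):
        B[i][k] += B[j][k]  # row_i <- row_i + row_j
    for k in range(n):
        B[k][j] -= B[k][i]  # col_j <- col_j - col_i
    return B


def trace(M):
    return sum(M[k][k] for k in range(len(M)))


def det(M):
    """Determinant by cofactor expansion along the first row (small matrices only)."""
    if len(M) == 1:
        return M[0][0]
    total = F(0)
    for c in range(len(M)):
        minor = [row[:c] + row[c + 1:] for row in M[1:]]
        total += (-1) ** c * M[0][c] * det(minor)
    return total


# Hypothetical matrix, chosen only for illustration.
A = [[F(1, 2), F(1, 4), F(1, 4)],
     [F(1, 3), F(1, 3), F(1, 3)],
     [F(1, 4), F(1, 4), F(1, 2)]]

B = add_row_then_subtract_col(A, 0, 2)
print(trace(A) == trace(B), det(A) == det(B))
print([B[0][2], B[1][2]])  # last-column entries of the first two rows
```

For this particular matrix the last-column entries of the first two rows vanish after the operation, so the transformed matrix is block triangular and its characteristic polynomial factors into that of the upper-left $2 \times 2$ block and a linear factor.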
Example 32 (Rule 90 on a cylinder of size 6) Let $S$ represent the global
operator for rule 90. On a cylinder of size 6 the equivalence classes for $T_n(S, 1)$
are

0:0 = $\{000000\}$
0:1 = $\{111111, 010101, 101010\}$
1:0 = $\{011011, 101101, 110110\}$
1:1 = $\{\sigma^{r}(001001) \mid 0 \le r \le 2\} \cup \{\sigma^{r}(000111) \mid 0 \le r \le 5\}$
2:0 = $\{000101, 010001, 010100\}$
2:1 = $\{\sigma^{2r}(000001) \mid 0 \le r \le 2\} \cup \{\sigma^{2r}(111110) \mid 0 \le r \le 2\} \cup \{\sigma^{2r}(101011) \mid 0 \le r \le 2\}$
3:0 = $\{001010, 100010, 101000\}$
3:1 = $\{\sigma^{2r}(000010) \mid 0 \le r \le 2\} \cup \{\sigma^{2r}(111101) \mid 0 \le r \le 2\} \cup \{\sigma^{2r}(010111) \mid 0 \le r \le 2\}$
4:0 = $\{111100, 001111, 110011\}$
4:1 = $\{\sigma^{2r}(000011) \mid 0 \le r \le 2\} \cup \{\sigma^{2r}(100110) \mid 0 \le r \le 2\} \cup \{\sigma^{2r}(011001) \mid 0 \le r \le 2\}$
5:0 = $\{111001, 100111, 011110\}$
5:1 = $\{\sigma^{2r}(000110) \mid 0 \le r \le 2\} \cup \{\sigma^{2r}(001101) \mid 0 \le r \le 2\} \cup \{\sigma^{2r}(110010) \mid 0 \le r \le 2\}$
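
The structure behind this class list can be explored by brute force. The sketch below is a minimal check, assuming the standard form of rule 90, $x_i \mapsto x_{i-1} + x_{i+1} \pmod 2$, with cyclic boundary conditions; the function names are hypothetical. It enumerates all 64 states, collects those lying on cycles, and collects the predecessors of the zero state.

```python
from itertools import product


def rule90(state):
    """Rule 90 on a cylinder: x_i -> x_{i-1} XOR x_{i+1}, cyclic boundary."""
    n = len(state)
    bits = [int(c) for c in state]
    return "".join(str(bits[(i - 1) % n] ^ bits[(i + 1) % n]) for i in range(n))


def all_states(n):
    return ["".join(b) for b in product("01", repeat=n)]


def cycle_states(n):
    """States that return to themselves under iteration of the rule."""
    on_cycle = set()
    for start in all_states(n):
        s = rule90(start)
        for _ in range(2 ** n):
            if s == start:
                on_cycle.add(start)
                break
            s = rule90(s)
    return on_cycle


attractor = cycle_states(6)
kernel = {s for s in all_states(6) if rule90(s) == "000000"}

print(len(attractor), sorted(kernel))
print(rule90("000101"), rule90("101000"))  # a state of 2:0 and its image
print(rule90("011011"))                    # a state of 1:0
```

Under this assumption the predecessors of 000000 are exactly the states of 0:0 and 0:1, and the listed state 000101 of class 2:0 alternates with 101000 of class 3:0.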





Classes sharing states from the same shift cycles are: (2:0,3:0), (2:1,3:1),
(4:0,5:0), and (4:1,5:1). The matrix $T_n(S,1)$ with interior/peripheral indexing
in the order 0:0, 1:0, 2:0, 3:0, 4:0, 5:0, 0:1, 1:1, 2:1, 3:1, 4:1, 5:1 is



[Matrix omitted: the entries of the 12 × 12 matrix $T_n(S,1)$ are not recoverable from the source text.]






Carrying out the operations indicated: adding the row labelled by 3:0 to that 
labelled by 2:0; by 3:1 to that labelled by 2:1; by 5:0 to that labelled by 4:0; and 
by 5:1 to that labelled by 4:1 then moving rows 3:0, 3:1, 5:0, and 5:1 to the 
bottom of the matrix; then subtracting column 2:0 from 3:0, 2:1 from 3:1, 4:0 
from 5:0, and 4:1 from 5:1 followed by moving columns 3:0, 3:1, 5:0, and 5:1 
to the far right of the matrix yields the form 



[Matrix omitted: the resulting 12 × 12 matrix is not recoverable from the source text.]
Fig. B.1. State transition diagram for left justified rule 54 on a cylinder of size 6. The
numbers accompanying each diagram enumerate the cycle structures. The presence
of more than one number indicates that several cycles have the same structure.



Again this has the form of (44). For the binary difference rule

$|\lambda I - T_6(D, 1)| = \lambda^3 \, |\lambda I - T_6(D^*, 1)|$



For rule 90 the matrix C is not zero and 



XI - T 6 (S, 1)| = (A 2 - l) (A 2 - 1) \\I - T e (5\ 1) 



In both of these cases, computation of eigenvalues shows that the state-basin 
and shift-basin matrices have the same eigenvalues. The next example demon- 
strates that this is not always the case. 



Example 33 (Rule 54 on a Cylinder of Size 6) Figure B.l shows the state 
transition diagram for rule 54 on a cylinder of size 6. 



The equivalence classes drawn from this STD are: 
0:0 = $\{000000\}$
0:1 = $\{111111, 010101, 101010\}$
0:2 = $\{001001, 010010, 100100\}$
0:3 = $\{\sigma^{r}(000011) \mid 0 \le r \le 5\}$
0:4 = $\{\sigma^{r}(001111) \mid 0 \le r \le 5\}$
0:5 = $\{\sigma^{r}(101100)\} \cup \{\sigma^{r}(100110)\}$, $0 \le r \le 5$
1:0 = $\{\sigma^{2r}(000001)\} \cup \{\sigma^{2r}(000111)\} \cup \{\sigma^{2r}(010001)\} \cup \{\sigma^{2r}(011111)\}$, $0 \le r \le 2$
1:1 = $\{\sigma^{2r}(011101)\}$, $0 \le r \le 2$
2:0 = $\{\sigma^{2r}(000010)\} \cup \{\sigma^{2r}(001110)\} \cup \{\sigma^{2r}(100010)\} \cup \{\sigma^{2r}(111110)\}$, $0 \le r \le 2$
2:1 = $\{\sigma^{2r}(111010)\}$, $0 \le r \le 2$



Taking the row and column indexing in the order: 0:0, 0:1, 0:3, 0:4, 1:1, 2:1,
0:2, 0:5, 1:0, 2:0, the matrix $T_n(54, 1)$ has the form



[Matrix omitted: the entries of the 10 × 10 matrix $T_n(54,1)$ are not recoverable from the source text.]



The equivalence classes sharing shift cycles are (1:0, 2:0) and (1:1, 2:1).
Application of Algorithm 1 yields the matrix



[Matrix omitted: the resulting 10 × 10 matrix is not recoverable from the source text.]



The $8 \times 8$ matrix in the upper left is the matrix $T_n(54^*, 1)$, which has the
characteristic equation

0*(A) = A 2 (A - 1) ( A 5 + -A 4 - —A 3 - —A 2 - -^-A + -™—) . 
VKJ V ; V 4 240 6 > 480 38,880 38,880/ 

On the other hand, the characteristic equation for $T_n(54, 1)$ is

$\phi(\lambda) = \left(\lambda^2 - \tfrac{1}{4}\lambda - \tfrac{1}{16}\right)\phi^*(\lambda).$




The roots of the quadratic factor are $(1 \pm \sqrt{5})/8$ (.404508, −.154508), while the
roots of $\phi^*(\lambda)$ are .6883559, .2331465, .1031093, −.1649678, and −.2337321.
This shows that there are rules for which the roots of $\phi(\lambda)$ and $\phi^*(\lambda)$ are not
the same. In the first two cases, the rules were additive, while rule 54 is not
additive.
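
As a check on the stated roots of the quadratic factor, the factor can be expanded from them. The only assumption in this short worked computation is that the factor is taken to be monic.

```latex
\[
\left(\lambda - \frac{1+\sqrt{5}}{8}\right)\left(\lambda - \frac{1-\sqrt{5}}{8}\right)
  = \lambda^2 - \frac{2}{8}\lambda + \frac{1 - 5}{64}
  = \lambda^2 - \frac{1}{4}\lambda - \frac{1}{16},
\]
\[
\frac{1+\sqrt{5}}{8} \approx 0.404508, \qquad
\frac{1-\sqrt{5}}{8} \approx -0.154508 .
\]
```

Neither value appears among the five roots listed for $\phi^*(\lambda)$, which is the discrepancy the example points to.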






